<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Data Imaginist</title>
<link>https://tutor-church-15580.netlify.app/</link>
<atom:link href="https://tutor-church-15580.netlify.app/index.xml" rel="self" type="application/rss+xml"/>
<description></description>
<generator>quarto-1.5.57</generator>
<lastBuildDate>Wed, 28 Feb 2024 00:00:00 GMT</lastBuildDate>
<item>
  <title>A bunch of giraffes, all bundled up</title>
  <link>https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/</link>
  <description><![CDATA[ 




<p><img src="https://tutor-church-15580.netlify.app/assets/img/ggraph_logo.png" class="img-fluid" style="display:none;"></p>
<p>My return to ggraph development was not supposed to end up with something deserving of a blog post. It was supposed to be a quick triage of bugs to quench my bad conscience for not having looked at the package for quite some time (well, the instigator was the new ggplot2 release, which required some changes in ggraph). Yet, one thing led to another and now I’m sitting here, writing a release post indicating that it turned out to be more than just a series of bug fixes.</p>
<p>So, while this is certainly not a monumental release, let’s celebrate the fact that some very welcome additions managed to lure me into a proper update of the package.</p>
<p>What is ggraph? If you came to this blog post not knowing what this is all about (and made it past the two rambling top paragraphs) you have shown an impressive tenacity towards my R package work. ggraph is a ggplot2 extension for visualising relational data (networks, graphs, hierarchies, etc.). It is one of the most versatile frameworks for creating network visualisations and all around a great package. You can learn more about it <a href="https://ggraph.data-imaginist.com">on its webpage</a>, which also includes extensive documentation of its features.</p>
<p>If the above is old news to you, you are probably sitting patiently waiting for me to tell you what is inside this new release. Wait no more…</p>
<section id="spatial-layouts" class="level2">
<h2 class="anchored" data-anchor-id="spatial-layouts">Spatial layouts</h2>
<p>Some time ago sfnetworks was developed on top of tidygraph to handle spatial networks with the tidygraph API. Thanks to a PR from <a href="https://github.com/loreabad6">Lorena Crespo</a>, ggraph now works natively with this class. The layout itself is pretty simple, as it takes the node locations already stored in the object and uses them as is. But the layout is also accompanied by a new node geom and a new edge geom that ensure that the correct CRS is used during plotting etc. The basic use goes something like this:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggraph)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidygraph)</span>
<span id="cb1-3"></span>
<span id="cb1-4">gr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> sfnetworks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_sfnetwork</span>(sfnetworks<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>roxel)</span>
<span id="cb1-5"></span>
<span id="cb1-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'sf'</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb1-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_sf</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> type)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb1-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>This can of course be used together with other sf layers for decorations and such (e.g.&nbsp;city boundaries) using <code>geom_sf()</code> from ggplot2.</p>
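<p>As a minimal sketch of such a decoration, the snippet below draws a filled backdrop behind the street network. No boundary data ships with the example, so the convex hull of the roads (the <code>hull</code> object, computed with sf) is a made-up stand-in for something like a city boundary:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(ggraph)

gr &lt;- sfnetworks::as_sfnetwork(sfnetworks::roxel)

# Hypothetical stand-in for a real boundary layer: the convex hull of all roads
hull &lt;- sf::st_sf(
  geometry = sf::st_convex_hull(sf::st_union(sf::st_geometry(sfnetworks::roxel)))
)

ggraph(gr, 'sf') + 
  # Regular ggplot2 sf layer, added first so it is drawn behind the network
  geom_sf(data = hull, fill = 'grey90', colour = NA) + 
  geom_edge_sf(aes(color = type)) + 
  geom_node_sf(size = 0.3)</code></pre></div>
</div>
<p>With real data you would swap <code>hull</code> for whatever sf object holds your boundaries.</p>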
<p>While you are often going for as correct a representation of the location data as possible when working with spatial data, there are situations where a more stylized look is wanted. One such situation is railroad and metro maps, where the standard has long been to prefer legibility over correctness. ggraph now has a layout that places nodes in a manner akin to what we expect for these types of maps. It is, like many of the layouts in ggraph, provided through the graphlayouts package by <a href="https://github.com/schochastics">David Schoch</a> and, while it is a bit finicky, it can provide a great starting point for a grid-like graph layout.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">gr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_tbl_graph</span>(graphlayouts<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>metro_berlin) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb2-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">convert</span>(to_simple)</span>
<span id="cb2-3"></span>
<span id="cb2-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'metro'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> lat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> lon, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">grid_space =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.005</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb2-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_link</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb2-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb2-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb2-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="hierarchical-layouts" class="level2">
<h2 class="anchored" data-anchor-id="hierarchical-layouts">Hierarchical layouts</h2>
<p>ggraph already has ample layout choices if your data is hierarchical, and now you are spoiled with even more.</p>
<p>Cactustree is a layout that, if you squint your eyes and are a bit imaginative, resembles a cactus. While that sounds a bit odd at first, it makes pretty good sense once you see it. The layout was developed with hierarchical edge bundling in mind, so while it can certainly be used to show hierarchical relations there are probably better layouts for that if that is your only concern.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">gr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tbl_graph</span>(flare<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>vertices, flare<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>edges) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> stringr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">str_match</span>(name, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"flare</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\\</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">.(</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\\</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">w+)"</span>)[,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>])</span>
<span id="cb3-3">from <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">match</span>(flare<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>imports<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>from, flare<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>vertices<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>name)</span>
<span id="cb3-4">to <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">match</span>(flare<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>imports<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>to, flare<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>vertices<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>name)</span>
<span id="cb3-5"></span>
<span id="cb3-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'cactustree'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">scale_factor =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_circle</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> class), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">colour =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">NA</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">show.legend =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">FALSE</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_conn_bundle</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">alpha =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">after_stat</span>(index)), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">data =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_con</span>(from, to)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_edge_alpha</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">range =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">guide =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'none'</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb3-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>While the above layout is certainly flashy, the next one is not. The H tree layout is a space-filling layout that can only be used for binary trees, so its application is quite limited. But, if you have a binary tree you need to show, this is your friend:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">gr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">create_tree</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1023</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-2"></span>
<span id="cb4-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"htree"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_link</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb4-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">filter =</span> leaf))</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="other-layout-goodies" class="level2">
<h2 class="anchored" data-anchor-id="other-layout-goodies">Other layout goodies</h2>
<p>Some of the existing layouts have been updated with new features, worthy of a mention.</p>
<p>The linear layout now has a <code>weight</code> argument that can control the spacing between points. In conjunction with the layout now outputting enough information for use with rect and arc nodes, this opens up some new possibilities:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">gr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">create_notable</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Meredith'</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb5-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">convert</span>(to_directed) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb5-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">class =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample</span>(letters[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>], <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>(), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">replace =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>),</span>
<span id="cb5-4">         <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">size =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pmax</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rnorm</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>())),</span>
<span id="cb5-5">         <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">amount =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">runif</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">n</span>()))</span>
<span id="cb5-6"></span>
<span id="cb5-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"linear"</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">circular =</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">TRUE</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">weight =</span> size) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-8">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_arc</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_node_arc_bar</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">r =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> amount<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> class)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb5-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_fixed</span>()</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>The other updates come courtesy of new functionality in the graphlayouts package and bring the layouts provided by ggraph up to speed with the implementations in graphlayouts. This means that the focus and centrality layouts get a <code>group</code> argument that allows grouping of related nodes in these two layouts. Further, the stress layout (the default layout in ggraph) gains an <code>x</code> and a <code>y</code> argument which can be used to fix some (or all) nodes in one or two dimensions. If either is given, then <code>NA</code> values indicate that a node should be placed by the layout algorithm, given the constraints of the fixed nodes.</p>
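<p>A rough sketch of both additions could look like the following. It assumes the Zachary karate club graph as example data, and the <code>x_fixed</code> and <code>club</code> columns are made up purely for the illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(ggraph)
library(tidygraph)

set.seed(1)
gr &lt;- create_notable('Zachary') |&gt; 
  mutate(
    # Pin the first two nodes on the x axis; NA leaves the rest to the layout
    x_fixed = c(-2, 2, rep(NA, n() - 2)),
    # Made-up grouping, only here to have something to group by
    club = sample(1:3, n(), replace = TRUE)
  )

# Stress layout with two nodes fixed in one dimension
ggraph(gr, 'stress', x = x_fixed) + 
  geom_edge_link() + 
  geom_node_point(aes(colour = !is.na(x_fixed)))

# Centrality layout where nodes sharing a group are placed together
ggraph(gr, 'centrality', centrality = centrality_degree(), group = club) + 
  geom_edge_link() + 
  geom_node_point(aes(colour = factor(club))) + 
  coord_fixed()</code></pre></div>
</div>
<p>In the first plot only the two pinned nodes have a predetermined x position, while their y positions and the positions of all other nodes are left to the algorithm.</p>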
</section>
<section id="all-them-bundles" class="level2">
<h2 class="anchored" data-anchor-id="all-them-bundles">All them bundles</h2>
<p>We talked about hierarchical edge bundling back when I showed the cactustree layout. While that was the first (I believe) type of edge bundling, it did suffer from the fact that it needed an underlying hierarchical structure for the bundles to work. This created a disconnect between the graph the layout was created on and the edges that were shown (which is why they are drawn with <code>geom_conn_*()</code> and not <code>geom_edge_*()</code> functions), a disconnect that later work has sought to remove. This has resulted in a bunch of different generalised edge bundling techniques, and ggraph now supports a few of them, thanks mainly to David Schoch (again).</p>
<p>The force bundling technique treats edges as springs that attract each other if they run in parallel (it’s a bit more involved, but that is the main gist). It was one of the first techniques to be developed and suffers from two main drawbacks. First, it is computationally expensive. In ggraph it is implemented with memoisation so that you don’t recalculate it again and again, but the first pass can be taxing for larger networks. Second, the bundling doesn’t really use any topological information, and unrelated edges can thus end up in bundles together, indicating interaction where none exists.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">gr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_tbl_graph</span>(edgebundle<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span>us_flights)</span>
<span id="cb6-2">states <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map_data</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"state"</span>)</span>
<span id="cb6-3"></span>
<span id="cb6-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> longitude, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> latitude) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_polygon</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(long, lat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> group), states, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">linewidth =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">crs =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'NAD83'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">default_crs =</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_crs</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb6-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_bundle_force</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>If the caveats stated above have made you sceptical, ggraph also provides an alternative bundling technique that tackles both of them. The edge path bundling algorithm doesn’t use any attracting forces when bundling. Instead it directs edges through their shortest path on an increasingly sparse version of the input graph. This, again, results in bundling, but this time the topology of the graph is being used, so the bundles should to a larger degree make sense. It is also much faster to compute.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> longitude, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> latitude) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb7-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_polygon</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(long, lat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> group), states, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">linewidth =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb7-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">crs =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'NAD83'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">default_crs =</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_crs</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb7-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_bundle_path</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>In every way an improvement. However, remember that, just like with layouts, there is no single right answer when it comes to edge bundling. You are introducing a bias to the representation and trying out different approaches is always a good idea.</p>
<p>The last bundling technique is very quick and dirty and a home invention of mine. It works much like the edge path bundling, but instead of gradually removing edges from the graph where the shortest path is searched for, all paths are found in the minimal spanning tree, so it can be done in one go. This makes it the most performant of the three, but it suffers from forcing a tree-like structure onto the topology that the edges follow. It usually also requires a higher <code>max_distortion</code> setting since the minimal spanning tree forces edges on a larger detour.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> longitude, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> latitude) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb8-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_polygon</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(long, lat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> group), states, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">linewidth =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb8-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">crs =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'NAD83'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">default_crs =</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_crs</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb8-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_bundle_minimal</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">max_distortion =</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
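<p>To make the idea a bit more concrete, here is a conceptual sketch of the routing step using igraph directly. It is not how ggraph implements the geom, and the toy graph, its weights, and the variable names are made up for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(igraph)

# Hypothetical toy graph: a hexagon with unit-length sides plus one chord
g &lt;- make_ring(6) |&gt; add_edges(c(1, 4))
E(g)$weight &lt;- c(rep(1, 6), 2)

# The minimal spanning tree is computed once and reused for every edge
tree &lt;- mst(g)

# Route the chord between node 1 and node 4 along the tree
route &lt;- shortest_paths(tree, from = 1, to = 4)$vpath[[1]]

# Length of the detour relative to the direct edge (which has weight 2);
# a ratio like this is what a distortion limit would be compared against
detour &lt;- distances(tree, v = 1, to = 4)[1, 1] / 2</code></pre></div>
</div>
<p>Because the tree never changes, every edge can be routed independently, which is where the speed comes from. The price is the detour: any edge that is not part of the tree has to travel along it.</p>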
<p>All in all, the edge bundling support has been greatly enhanced. I’d still like to add a technique that better splits out edges going in opposite directions, but that will be for another release. Edge path bundling does treat directed graphs differently, since the shortest path is direction dependent, but there are also other techniques that are worth exploring.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># This network doesn't really make sense to view as directed but we do it anyway</span></span>
<span id="cb9-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># to show the difference in output</span></span>
<span id="cb9-3">gr <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> gr <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">convert</span>(to_directed)</span>
<span id="cb9-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggraph</span>(gr, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">x =</span> longitude, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> latitude) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb9-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_polygon</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(long, lat, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> group), states, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">linewidth =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb9-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_sf</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">crs =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'NAD83'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">default_crs =</span> sf<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">st_crs</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4326</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb9-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_edge_bundle_path</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">color =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'white'</span>, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">width =</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="wrapping-up" class="level2">
<h2 class="anchored" data-anchor-id="wrapping-up">Wrapping up</h2>
<p>That’s about it. The release of course also includes numerous bug fixes, which was the whole reason why I started working on it in the first place. A lot of the new features presented couldn’t have happened without the work of David Schoch who has made great contributions to the network support in R and in tidygraph and ggraph in particular. Also a big thanks to the people working on sfnetworks and Lorena Crespo in particular for adding support in ggraph.</p>


</section>

 ]]></description>
  <category>ggraph</category>
  <category>announcement</category>
  <category>package</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2024-02-15-ggraph-2-2-0/</guid>
  <pubDate>Wed, 28 Feb 2024 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/ggraph_logo.png" medium="image" type="image/png" height="68" width="144"/>
</item>
<item>
  <title>A small patch of free features</title>
  <link>https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/</link>
  <description><![CDATA[ 




<p><img src="https://tutor-church-15580.netlify.app/assets/img/patchwork_logo.png" class="img-fluid" style="display:none;"></p>
<p>What is that? Another blog post not even a month after the last? This feels like 2017. Maybe I’m a bit extra attentive because I’ve had fun porting over my blog to quarto and also finally building a proper <a href="https://thomaslinpedersen.art">site for my generative art</a> rather than lumping it into my R/OSS blog. Or maybe I just finally have something interesting to share for the first time in a while…</p>
<p>That <em>interesting</em> thing today is a new release of <a href="https://patchwork.data-imaginist.com">patchwork</a> — my package for easily combining multiple plots into complex and well-aligned compositions. It is not the grandest of releases — after all the package does what it does well — but it does provide two new features that I’ve been looking forward to:</p>
<section id="there-can-be-only-one-axis" class="level2">
<h2 class="anchored" data-anchor-id="there-can-be-only-one-axis">There can be only one (axis)</h2>
<p>One of the features in patchwork I’m particularly fond of is its ability to collect and de-duplicate legends. It is one of those touches that makes the final composition feel like a whole. Missing from this has been a similar function for axes. This has been even more glaring because we are used to de-duplicated axes from faceted plots, and not having that in patchwork felt wrong. I always intended to add this but never got around to it. Thankfully, <a href="https://github.com/teunbrand">Teun van den Brand</a> took a stab at it and filled the gap.</p>
<p>This new functionality is two-fold as it is split up in axes and axis titles (though the setting for axis titles defaults to that for axes so you can usually get by only setting it for axes).</p>
<p>Consider these two plots:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(patchwork)</span>
<span id="cb1-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(ggplot2)</span>
<span id="cb1-3"></span>
<span id="cb1-5">p1 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(mtcars) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb1-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_point</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(mpg, disp)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb1-7">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggtitle</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Plot 1'</span>)</span>
<span id="cb1-8"></span>
<span id="cb1-9">p2 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(mtcars) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb1-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_boxplot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(gear, disp, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">group =</span> gear)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> </span>
<span id="cb1-11">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggtitle</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Plot 2'</span>)</span>
<span id="cb1-12"></span>
<span id="cb1-13">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p2</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-1-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>As we can see they share the exact same y-axis and you might want to avoid the visual clutter of keeping the axis of the rightmost plot. Of course you could remove it through theming, setting the relevant theme elements to <code>element_blank()</code>. But that is such a hassle! Using the axis collecting is much easier:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axes =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-2-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
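<p>For comparison, the manual theming route mentioned above would look something like the following sketch. The plots are redefined so the snippet runs on its own, and <code>p2_bare</code> and <code>manual</code> are just names picked for illustration:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(ggplot2)
library(patchwork)

p1 &lt;- ggplot(mtcars) + geom_point(aes(mpg, disp)) + ggtitle('Plot 1')
p2 &lt;- ggplot(mtcars) + geom_boxplot(aes(gear, disp, group = gear)) + ggtitle('Plot 2')

# Blank out the title, labels, and ticks of the y-axis on the right-hand plot
p2_bare &lt;- p2 + theme(
  axis.title.y = element_blank(),
  axis.text.y = element_blank(),
  axis.ticks.y = element_blank()
)

manual &lt;- p1 + p2_bare
manual</code></pre></div>
</div>
<p>Unlike the collecting approach, this has to be repeated for every plot you want to strip, and nothing checks that the removed axis actually matches the one that is kept.</p>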
<p>If you like the clarity of the axis but prefer to not keep the title, you can use the <code>axis_titles</code> argument instead:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axis_titles =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-3-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Titles are collected if they are identical and the same is true for axes. This means that if you have two plots showing the same variable on the y-axis but with different ranges, the titles will be collected but the axes will not:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">coord_cartesian</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">ylim =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">300</span>)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axes =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>There is no facility to align the range of axes across plots so you’d still need to keep an eye on that. Still, you can always use <code>&amp;</code> to apply the same coordinate system or scale to all plots in a patchwork so it should be relatively easy to line up plots.</p>
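<p>A minimal sketch of that approach could look like this. The plots are redefined so the snippet runs on its own, <code>aligned</code> is just a name picked for illustration, and the limits are chosen arbitrarily to cover the full range of <code>disp</code>:</p>
<div class="cell">
<div class="sourceCode cell-code" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r">library(ggplot2)
library(patchwork)

p1 &lt;- ggplot(mtcars) + geom_point(aes(mpg, disp)) + ggtitle('Plot 1')
p2 &lt;- ggplot(mtcars) + geom_boxplot(aes(gear, disp, group = gear)) + ggtitle('Plot 2')

# `&amp;` adds the coordinate system to every plot in the patchwork,
# so both plots end up with the same y-range
aligned &lt;- p1 + p2 + plot_layout(axes = "collect") &amp;
  coord_cartesian(ylim = c(0, 500))
aligned</code></pre></div>
</div>
<p>Since both plots now share the same y-range, the axes should be considered identical and get collected.</p>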
<p>One difference from the legend collection is that collecting axes only works for plots in the same nesting level. There are reasons for this, mainly my sanity level and capacity to sleep at night. Still, it means that one should be aware of the “hidden” nesting that can occur when using <code>/</code> and <code>|</code> for composition:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb5-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> p2) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axes =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>A better approach for this would be to keep the same nesting level but use the <code>widths</code> argument to get the same look:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">widths =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axes =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-6-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>The attentive reader will observe that apart from “fixing” the problem at hand, something else happened to the plot. The middle plot suddenly lost its x-axis title and the x-axis title of the left plot got moved somewhat to the right. This is because axis title collecting works in both directions, i.e.&nbsp;if adjacent axis titles are identical they will get merged and the final title will occupy the full area of the merged ones. The effect may be clearer in a simpler layout:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb7-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> p2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axis_titles =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-7-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>For the prior plot, if we would like to avoid this behavior because it is not obvious which x-axis title the middle plot relates to, we can set the collecting to only happen in one direction:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb8-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> p2 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">widths =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">axes =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect_y"</span>)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="being-free-from-constraint" class="level2">
<h2 class="anchored" data-anchor-id="being-free-from-constraint">Being free from constraint</h2>
<p>The other feature I’ll discuss will probably make a lot of people happy. Questions about how to <em>not</em> align plots are numerous and usually come down to plots with excessively long y-axis labels (sorry for keeping with the mtcars dataset; I know we have it figured out quite well at this point):</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb9-1">p3 <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ggplot</span>(mtcars) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">geom_bar</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">aes</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">y =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(gear), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">fill =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">factor</span>(gear))) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span></span>
<span id="cb9-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scale_y_discrete</span>(</span>
<span id="cb9-4">    <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">""</span>,</span>
<span id="cb9-5">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">labels =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">c</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"3 gears are often enough"</span>,</span>
<span id="cb9-6">               <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"But, you know, 4 is a nice number"</span>,</span>
<span id="cb9-7">               <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I would def go with 5 gears in a modern car"</span>)</span>
<span id="cb9-8">  )</span>
<span id="cb9-9">p3</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-9-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>We can see how such a plot could mess up a composition:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb10-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> p3</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-10-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>My answer to these questions/issues has always been to use <code>wrap_elements()</code> which, to be fair, gets the job done OK’ish:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb11-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">wrap_elements</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot =</span> p3)</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-11-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>However, there are some shortcomings to this approach. First, it is pretty verbose and not very descriptive of what it does/what your intent is. This is not the end of the world, but the API of patchwork is pretty great (IMHO) so it feels like a bad concession to give all that up here. Second, using <code>wrap_elements()</code> “freezes” the plot inside it, so you can no longer modify it, e.g.&nbsp;with <code>&amp;</code> or through guide collecting:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb12-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">wrap_elements</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot =</span> p3) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">guides =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&amp;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_dark</span>()</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-12-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>Another thing is that the plot margin is part of the plot that gets inserted into the plot region. If we remove the legend and increase the margin we can see an annoying misalignment between the right edges of the plots:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb13-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">wrap_elements</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot =</span> p3 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">plot.margin =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">margin</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>), <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">legend.position =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"none"</span>))</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-13-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>That was a lot of dunking on <code>wrap_elements()</code>. This is mainly because it was the wrong tool for the job, not because there is anything particularly wrong with it as is. No matter, we now have the right tool:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb14-1">p1 <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">free</span>(p3) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot_layout</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">guides =</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"collect"</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&amp;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">theme_dark</span>()</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/index_files/figure-html/unnamed-chunk-14-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
<p>There is not much more to it. Wrap a plot in <code>free()</code> if you want to forgo the alignment that patchwork performs, and it will do exactly that without getting in the way of the other functionality in patchwork.</p>
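<p>As a minimal, self-contained sketch of the same idea (the names <code>p_top</code> and <code>p_bottom</code> and the axis labels are placeholders made up for this illustration, not the <code>p1</code>/<code>p3</code> objects used above), the following should compose a regular plot with one that has long axis labels, while still letting <code>&amp;</code> reach both plots:</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)
library(patchwork)

# A regular scatter plot
p_top &lt;- ggplot(mtcars) +
  geom_point(aes(mpg, disp))

# A bar chart whose long axis labels would normally push the panels inwards
p_bottom &lt;- ggplot(mtcars) +
  geom_bar(aes(y = factor(gear), fill = factor(gear))) +
  scale_y_discrete(
    NULL,
    labels = c("Three gears, the classic choice", "Four gears, a fine compromise", "Five gears, for the modern driver")
  )

# free() opts p_bottom out of panel alignment; `&amp;` still applies to both plots
combined &lt;- p_top / free(p_bottom) &amp; theme_minimal()
combined</code></pre>
</div>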
<p>And now it is time to leave mtcars alone. Happy plotting!</p>


</section>

 ]]></description>
  <category>patchwork</category>
  <category>announcement</category>
  <category>package</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2024-01-05-patchwork-1-2-0/</guid>
  <pubDate>Mon, 08 Jan 2024 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/patchwork_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>A new focus on tidygraph</title>
  <link>https://tutor-church-15580.netlify.app/posts/2023-12-18-a-new-focus-on-tidygraph/</link>
  <description><![CDATA[ 




<p>I’m pleased to announce a new release of <a href="https://tidygraph.data-imaginist.com">tidygraph</a>. It has been a while since something major has happened to the package, reflecting the stable nature of it, but this time I felt like doing a bit more than just brush it off for the occasional upstream dependency change. So, while it is in no way a grandiose release, it does contain enough new stuff to warrant a small blog post. If you are a tidygraph user you should definitely read on, otherwise perhaps explore the <a href="https://tidygraph.data-imaginist.com">project website</a> first and become a user.</p>
<section id="let-us-focus-on-the-news" class="level2">
<h2 class="anchored" data-anchor-id="let-us-focus-on-the-news">Let us focus on the news</h2>
<p>One new feature I’m particularly excited about is the inclusion of a new <code>focus()</code>/<code>unfocus()</code> pair of verbs. Part of my excitement is that this was one of my original ideas for the package but was scrapped prior to release and then left to linger. The other reason is of course that it is super useful. So what does it do?</p>
<p>Let’s start with the why. For classic tabular data you generally expect all data to be equally important during computations. Each row is an observation that needs to be treated with the same care. You perhaps do some filtering, but for the filtered result it again holds that each row is equally important. For such data the vectorised approach of R (and thus dplyr) makes perfect sense. We tend to want to calculate stuff for each row. The same is not always true for graph data. We might have nodes that are the main focus of our attention and nodes that are simply auxiliary. But performing a filter will alter our graph, and that might change our calculations due to the connectedness of our data. For many calculations this is of little concern, as the algorithms are fast enough that the vectorised paradigm of tidygraph is fine: we compute everything and simply ignore what we don’t need. But what if we have a huge graph and an algorithm that scales exponentially with the number of edges, and we really are only interested in the result for a few nodes or edges?</p>
<p>Enter the <code>focus()</code> verb. It allows you to perform a temporary filtering of the nodes or edges you are working on without removing the underlying graph structure. In practice it means that any tidygraph algorithm will only be called on the nodes or edges that are in focus, but the algorithm will have access to the full graph and will thus return the same result for the focused nodes/edges irrespective of whether the focus was applied or not.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">library</span>(tidygraph)</span>
<span id="cb1-2">graph <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">play_forestfire</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e5</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-3">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">important =</span> dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">row_number</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-4">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">focus</span>(important) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-5">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">efficiency =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">node_efficiency</span>()) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-6">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unfocus</span>()</span>
<span id="cb1-7"></span>
<span id="cb1-8">graph <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-9">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">as_tibble</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb1-10">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">slice</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tibble: 10 × 2
   important efficiency
   &lt;lgl&gt;          &lt;dbl&gt;
 1 TRUE          0.0253
 2 TRUE          0.0274
 3 TRUE          0.0417
 4 TRUE          0.0306
 5 TRUE          0.0368
 6 FALSE        NA     
 7 FALSE        NA     
 8 FALSE        NA     
 9 FALSE        NA     
10 FALSE        NA     </code></pre>
</div>
</div>
<p>In the above code we calculate the local efficiency around each node, but since we are only interested in this measure for the first 5 nodes we focus on these and avoid computing it for the remaining 99995 nodes, gaining quite a speed boost. One (huge) caveat is that it is algorithm-dependent whether focusing on a subset provides a performance gain. Some algorithms work in a way where everything is calculated together, e.g.&nbsp;those that rely on convolutions of the distance matrix etc. In these cases no performance gain will be seen.</p>
<p>Focus can be applied both to nodes and edges depending on which one is activated. The focus is the weakest of all graph states and a graph will be unfocused if you either activate, group, or morph it, so think of it as the most temporary state of them all.</p>
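<p>A small sketch of what that looks like for both nodes and edges, assuming a toy ring graph and made-up column names (<code>keep</code> and <code>touched</code> are purely illustrative). Only the focused rows should receive a value, and the rest should be left as <code>NA</code>, just like in the efficiency example above:</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(tidygraph)

# Node focus: nodes are active by default, so we can focus right away
node_data &lt;- create_ring(6) |&gt;
  mutate(keep = dplyr::row_number() &lt;= 2) |&gt;
  focus(keep) |&gt;
  mutate(touched = TRUE) |&gt;   # only evaluated for the focused nodes
  unfocus() |&gt;
  as_tibble()

# Edge focus: activate the edges first, then focus on a subset of them
edge_data &lt;- create_ring(6) |&gt;
  activate(edges) |&gt;
  mutate(keep = dplyr::row_number() &lt;= 2) |&gt;
  focus(keep) |&gt;
  mutate(touched = TRUE) |&gt;   # only evaluated for the focused edges
  unfocus() |&gt;
  as_tibble()

node_data$touched
edge_data$touched</code></pre>
</div>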
</section>
<section id="iterating-on-old-ideas" class="level2">
<h2 class="anchored" data-anchor-id="iterating-on-old-ideas">Iterating on old ideas</h2>
<p>Another old feature idea of mine that finally materialized is a set of <code>iterate_*()</code> verbs. Those are quite a bit simpler but useful nonetheless if you want to encode simple simulations on graphs using tidygraph syntax. You can think of these as functional equivalents of <code>while () {}</code> and <code>for () {}</code> so you can incorporate them into a pipe. As an example let’s consider a simulation that removes an edge unless it isolates one of its nodes:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1">unwire <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span>(graph) {</span>
<span id="cb3-2">  edge <span class="ot" style="color: #003B4F;
background-color: null;
font-style: inherit;">&lt;-</span> graph <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-3">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(nodes) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-4">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">well_connected =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">centrality_degree</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-5">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(edges) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-6">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mutate</span>(<span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">can_remove =</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">.N</span>()<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>well_connected[from] <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&amp;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">.N</span>()<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">$</span>well_connected[to],</span>
<span id="cb3-7">           <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">will_remove =</span> dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">row_number</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample</span>(dplyr<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">row_number</span>(), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>L, <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">prob =</span> can_remove)) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-8">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pull</span>(will_remove)</span>
<span id="cb3-9">  graph <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-10">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">activate</span>(edges) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb3-11">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">filter</span>(<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>edge)</span>
<span id="cb3-12">}</span></code></pre></div>
</div>
<p>We can use this function 20 times on our graph with the <code>iterate_n()</code> verb like so:</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">create_notable</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'meredith'</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb4-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">iterate_n</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>, unwire)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code># A tbl_graph: 70 nodes and 120 edges
#
# An undirected simple graph with 1 component
#
# Node Data: 70 × 0 (active)
#
# Edge Data: 120 × 2
   from    to
  &lt;int&gt; &lt;int&gt;
1     1     5
2     1     6
3     1     7
# ℹ 117 more rows</code></pre>
</div>
</div>
<p>Alternatively we can set up a condition to test for after each iteration that determines if iteration continues. Below we run the <code>unwire()</code> function until the graph has been split up into two components.</p>
<div class="cell">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb6-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">create_notable</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'meredith'</span>) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb6-2">  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">iterate_while</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">graph_component_count</span>() <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, unwire) <span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> </span>
<span id="cb6-3">  ggraph<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">autograph</span>()</span></code></pre></div>
<div class="cell-output-display">
<div>
<figure class="figure">
<p><img src="https://tutor-church-15580.netlify.app/posts/2023-12-18-a-new-focus-on-tidygraph/index_files/figure-html/unnamed-chunk-5-1.png" class="img-fluid figure-img" width="672"></p>
</figure>
</div>
</div>
</div>
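<p>If the <code>unwire()</code> function above feels like a lot to take in, here is a stripped-down sketch of the same two verbs. The helpers <code>drop_first_edge()</code> and <code>count_edges()</code> are hypothetical and only exist for this illustration. The step function must take a graph and return a graph, and the condition is evaluated before every step:</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(tidygraph)

# Hypothetical step function: drop the first edge of the graph
drop_first_edge &lt;- function(graph) {
  graph |&gt;
    activate(edges) |&gt;
    filter(dplyr::row_number() &gt; 1)
}

# Hypothetical helper: count the edges in a graph
count_edges &lt;- function(graph) {
  graph |&gt;
    activate(edges) |&gt;
    as_tibble() |&gt;
    nrow()
}

# Fixed number of steps: apply the step function three times to a 10-edge ring
fixed &lt;- create_ring(10) |&gt;
  iterate_n(3, drop_first_edge)
count_edges(fixed)

# Conditional: keep going for as long as more than 5 edges remain
conditional &lt;- create_ring(10) |&gt;
  iterate_while(graph_size() &gt; 5, drop_first_edge)
count_edges(conditional)</code></pre>
</div>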
</section>
<section id="catching-up" class="level2">
<h2 class="anchored" data-anchor-id="catching-up">Catching up</h2>
<p>It’s been a while since tidygraph has been updated with interfaces into new features from igraph. This release fixes that somewhat by providing the following new functions:</p>
<ul>
<li><p><code>edge_is_bridge()</code> will test for whether edges are bridges (their removal will result in splitting up a component into two)</p></li>
<li><p><code>edge_is_feedback_arc()</code> queries whether edges are part of the feedback arc set</p></li>
<li><p><code>graph_is_eulerian()</code> and <code>edge_rank_eulerian()</code> provide access to eulerian path and cycle calculations</p></li>
<li><p><code>graph_efficiency()</code> and <code>node_efficiency()</code> provide access to global and local efficiency calculations</p></li>
<li><p><code>group_leiden()</code> and <code>group_fluid()</code> provide access to the new <code>cluster_leiden()</code> and <code>cluster_fluid_communities()</code> community detection algorithms</p></li>
<li><p><code>group_color()</code> provides an interface to graph coloring. While not really a clustering algorithm the output matches closely with those as it provides a single id to each node</p></li>
<li><p><code>centrality_harmonic()</code> supersedes <code>centrality_closeness_harmonic()</code> using an efficient C implementation over the flexible but slower implementation from the netrankr package</p></li>
<li><p><code>random_walk_rank()</code> provides access to random walks on both edges and nodes</p></li>
<li><p><code>to_largest_component()</code> and <code>to_random_spanning_tree()</code> are two new morphers</p></li>
<li><p><code>node_is_connected()</code> tests whether nodes are connected to all or any of the nodes in a given set</p></li>
</ul>
<p>Apart from changes in igraph, tidygraph also needs to stay somewhat current with another package, namely dplyr. In this release we have added support for the various <code>slice_*()</code> types so that you can now use e.g.&nbsp;<code>slice_min()</code> or <code>slice_sample()</code> on tbl_graph objects. And while not directly dplyr (but tidyr) you can now use <code>replace_na()</code> and <code>drop_na()</code> with tbl_graph objects as well.</p>
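<p>A quick sketch of how these could be used, assuming a toy star graph and made-up column names (<code>degree</code> and <code>role</code>). I call the generics through their <code>dplyr::</code> and <code>tidyr::</code> namespaces here so the example does not depend on which verbs are re-exported:</p>
<div class="cell">
<pre class="sourceCode r"><code class="sourceCode r">library(tidygraph)

# A star graph: one hub connected to five leaves
star &lt;- create_star(6) |&gt;
  mutate(
    degree = centrality_degree(),
    role = ifelse(degree &gt; 1, "hub", NA_character_)
  )

# Keep only the best-connected node (the other nodes and their edges are removed)
hub &lt;- star |&gt;
  dplyr::slice_max(degree, n = 1)

# Fill in the missing roles instead of dropping those nodes
labelled &lt;- star |&gt;
  tidyr::replace_na(list(role = "leaf"))

as_tibble(hub)
as_tibble(labelled)</code></pre>
</div>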
</section>
<section id="wrapping-up" class="level2">
<h2 class="anchored" data-anchor-id="wrapping-up">Wrapping up</h2>
<p>Mature packages are a weird thing for a developer. You seldom spend much time with them as they are working as intended, even if they are a cornerstone of some of your work. Tidygraph definitely falls into this spot. It was nice to get to relearn it a bit as I prepared this release and I hope the new additions will spark joy. Take care.</p>


</section>

 ]]></description>
  <category>tidygraph</category>
  <category>announcement</category>
  <category>package</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2023-12-18-a-new-focus-on-tidygraph/</guid>
  <pubDate>Mon, 18 Dec 2023 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/tidygraph_logo.png" medium="image" type="image/png" height="68" width="144"/>
</item>
<item>
  <title>Say Goodbye to “Good Taste”</title>
  <link>https://tutor-church-15580.netlify.app/posts/2021-03-19-say-goodbye-to-good-taste/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/ggfx_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
I’m excited to announce the first release of the ggfx package, a package that brings R native filtering to grid and ggplot2 for the first time. You can install ggfx with:
</p>
<pre class="r"><code>install.packages('ggfx')</code></pre>
<p>
The purpose of ggfx is to give you access to effects that would otherwise require you to do some heavy post processing in programs such as Photoshop/Gimp or Illustrator/Inkscape, all from within R and as part of your reproducible workflow.
</p>
<section id="what-is-a-filter" class="level3">
<h3 class="anchored" data-anchor-id="what-is-a-filter">
What is a filter?
</h3>
<p>
A filter, in the context of image/photo editing, is a function that takes in raster data (i.e.&nbsp;an image rasterised to pixel values) and modifies these pixels somehow, before returning a new image. As such, the idea has seen a lot of traction with apps such as Instagram, which allow you to change the look of your photo by applying different filters to it.
</p>
<p>
So, a filter works with pixels. That provides some complications for vector based graphics such as the R graphics engine. Here you really don’t care about pixels, but simply instruct the engine to draw e.g.&nbsp;a circle at a specific position and with a certain radius and colour. The engine never comes in contact with the concept of pixels as it delegates the rendering to a graphics device which may, or may not, render it as a raster. In many ways this is parallel to how SVG works. SVG also just records instructions which need to be executed by a renderer (often a browser). Still, SVG has access to a limited number of filters as part of its specification. How does that work? Usually when an SVG is rendered and it includes a filter, the filtered part will be rasterised off-screen, and the filter will be applied before it is all composed together.
</p>
<p>
This is a concept that can be transferred to R, and it is exactly what ggfx does!
</p>
</section>
<section id="meet-the-filters" class="level2">
<h2 class="anchored" data-anchor-id="meet-the-filters">
Meet the filters!
</h2>
<p>
ggfx contains quite a lot of filters - some are pure fun, others will shock you, a few will prove useful. All filters are prefixed with <code>with_</code> to indicate that some graphic element should be rendered <em>with</em> the filter. To show this off, let’s reach for one of the easiest filters to understand: <em>blur!</em>
</p>
<pre class="r"><code>library(ggplot2)
library(ggfx)

p &lt;- ggplot(mpg) + 
  geom_point(aes(x = hwy, y = displ))

with_blur(p, sigma = 3)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<p>
We can see that the filter takes a graphic object, along with some filter specific settings, such as <code>sigma</code> which controls the amount of blur applied (specifically the size of the Gaussian kernel being used).
</p>
<p>
Now, it is not that common that you want to apply a filter to the full plot - thankfully, ggfx supports a range of different graphic objects and filters can thus equally be applied to layers:
</p>
<pre class="r"><code>ggplot(mpg) + 
  with_blur(
    geom_point(aes(x = hwy, y = displ)),
    sigma = 3
  )</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-3-1.png" width="672">
</p>
<p>
Other graphic objects that can be filtered are theme elements and guides:
</p>
<pre class="r"><code>ggplot(mpg) + 
  geom_point(aes(x = hwy, y = displ)) + 
  guides(
    x = with_blur(
      guide_axis(),
      sigma = 2
    )
  ) + 
  theme(
    panel.grid.major = with_blur(
      element_line(),
      sigma = 2
    )
  )</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-4-1.png" width="672">
</p>
<p>
With the basic API in mind we can take a look at the different filters:
</p>
<section id="blur-type-filters" class="level3">
<h3 class="anchored" data-anchor-id="blur-type-filters">
Blur type filters
</h3>
<p>
Blur is central to a lot of effects and thus part of many filters:
</p>
<ul>
<li>
<p>
<code>with_blur()</code>, as we have already seen, adds a constant blur to everything in its layer
</p>
</li>
<li>
<p>
<code>with_variable_blur()</code> allows you to control the amount and angle of blur at each location based on channel values in another layer
</p>
</li>
<li>
<p>
<code>with_motion_blur()</code> adds directional blur in a manner that simulates moving a camera/moving the subject
</p>
</li>
<li>
<p>
<code>with_inner_glow()</code> adds an inner glow effect to all objects in the layer (basically a coloured blur of the surroundings that is only visible on top of the objects)
</p>
</li>
<li>
<p>
<code>with_outer_glow()</code> adds an outer glow effect (a coloured blur of the objects that is only visible in the surroundings)
</p>
</li>
<li>
<p>
<code>with_drop_shadow()</code> adds a coloured blur underneath the layer with a specific offset
</p>
</li>
<li>
<p>
<code>with_bloom()</code> adds a specific blur effect to all light parts of the layer that simulates strong light spilling out into the surroundings
</p>
</li>
</ul>
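<p>
As a small sketch of how one of these is used (the label text and the parameter values are arbitrary choices made for this illustration), an outer glow can be added to a text layer by wrapping the layer in the filter, just as with <code>with_blur()</code> above:
</p>
<pre class="r"><code>library(ggplot2)
library(ggfx)

p_glow &lt;- ggplot() + 
  with_outer_glow(
    geom_text(aes(x = 0, y = 0, label = 'Glow!'), size = 20),
    colour = 'firebrick',
    sigma = 5,   # amount of blur used for the glow
    expand = 3   # grow the glow outwards before blurring
  )

p_glow</code></pre>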
</section>
<section id="blend-type-filters" class="level3">
<h3 class="anchored" data-anchor-id="blend-type-filters">
Blend type filters
</h3>
<p>
Users of Photoshop and similar programs know the power of blending layers. Usually layers are just placed on top of each other, but that is just one possibility.
</p>
<ul>
<li>
<p>
<code>with_blend()</code> allows you to blend two layers together based on both standard Porter-Duff alpha composition types, as well as others known from image editing programs such as <em>Multiply</em>, <em>Overlay</em>, and <em>Linear Dodge</em>
</p>
</li>
<li>
<p>
<code>with_custom_blend()</code> allows you to specify your own blend operation based on a standard formula coefficient setup
</p>
</li>
<li>
<p>
<code>with_mask()</code> allows you to set a mask on a layer, i.e.&nbsp;specify in which areas the layer is visible
</p>
</li>
<li>
<p>
<code>with_interpolate()</code> interpolates between two layers, fading them together
</p>
</li>
</ul>
</section>
<section id="dithering-type-filters" class="level3">
<h3 class="anchored" data-anchor-id="dithering-type-filters">
Dithering type filters
</h3>
<p>
Dithering is the act of reducing the number of colours used in an image, while retaining the look of the original colour fidelity. This has had uses in both image size reduction and screen printing, but now is mostly used for the particular visual effect it provides.
</p>
<ul>
<li>
<p>
<code>with_dither()</code> applies error-diffusion dithering using the Floyd-Steinberg algorithm
</p>
</li>
<li>
<p>
<code>with_ordered_dither()</code> uses a threshold map of a certain size to create dithering (also called Bayer dithering)
</p>
</li>
<li>
<p>
<code>with_halftone_dither()</code> uses another type of threshold map that simulates halftone/offset printing
</p>
</li>
<li>
<p>
<code>with_circle_dither()</code> uses an alternative threshold map to the above to create more circular shapes
</p>
</li>
<li>
<p>
<code>with_custom_dither()</code> allows you to use a custom threshold map you’ve created for ImageMagick
</p>
</li>
</ul>
</section>
<section id="other-filter-types" class="level3">
<h3 class="anchored" data-anchor-id="other-filter-types">
Other filter types
</h3>
<p>
There’s also a range of filters that defies grouping:
</p>
<ul>
<li>
<p>
<code>with_shade()</code> allows you to shade a layer based on a given heightmap
</p>
</li>
<li>
<p>
<code>with_kernel()</code> allows you to apply a custom kernel convolution to the layer
</p>
</li>
<li>
<p>
<code>with_displacement()</code> allows you to displace and distort your layer based on relative displacement values given in another layer
</p>
</li>
<li>
<p>
<code>with_raster()</code> simply rasterises your layer and displays that
</p>
</li>
</ul>
</section>
</section>
<section id="combining-layers" class="level2">
<h2 class="anchored" data-anchor-id="combining-layers">
Combining layers
</h2>
<p>
As may be apparent from the descriptions above, filters sometimes work with multiple layers at the same time. To facilitate this, ggfx can create layer references and layer group references which can then be used in another filter. We can showcase this with a blend filter. Below we create a reference to a text layer and blend it together with a polygon layer (through <code>geom_circle()</code> from ggforce) to achieve an effect that would be pretty difficult to get without filters.
</p>
<pre class="r"><code>library(ggforce)

ggplot() + 
  as_reference(
    geom_text(aes(x = 0, y = 0, label = 'Blend Modes!'), size = 20, family = 'Fontania'),
    id = 'text_layer'
  ) + 
  with_blend(
    geom_circle(aes(x0 = 0, y0 = 0, r = seq_len(5)), fill = NA, size = 8),
    bg_layer = 'text_layer',
    blend_type = 'xor'
  ) + 
  coord_fixed()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
Filters themselves can also be turned into references by assigning an id to them, which allows the result of a filter to be used in another filter:
</p>
<pre class="r"><code>ggplot() + 
  as_reference(
    geom_text(aes(x = 0, y = 0, label = 'Blend Modes!'), size = 20, family = 'Fontania'),
    id = 'text_layer'
  ) + 
  with_blend(
    geom_circle(aes(x0 = 0, y0 = 0, r = seq_len(5)), fill = NA, size = 8),
    bg_layer = 'text_layer',
    blend_type = 'xor',
    id = 'blended'
  ) + 
  with_inner_glow(
    'blended',
    colour = 'white',
    sigma = 5
  ) +
  coord_fixed()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<p>
Above we also see that filters can take references as their main graphic object instead of layers.
</p>
<p>
Some filters use other layers only to extract variable parameters, as seen in e.g.&nbsp;<code>with_variable_blur()</code> and <code>with_displacement()</code>. Here we are only interested in the values of a single channel, as these can be converted to a single integer value for each pixel. ggfx gives you plenty of choice as to which channel to use through the set of <code>ch_</code> functions, which can be applied to the reference. If none is given, the luminosity is used by default. To illustrate this we create a raster layer with the volcano data, apply a rainbow colour scale to it (😱), and then use the red and blue channels to displace a circle:
</p>
<pre class="r"><code>volcano_long &lt;- data.frame(
  x = as.vector(col(volcano)),
  y  = as.vector(row(volcano)),
  z = as.vector(volcano)
)
ggplot() + 
  as_reference(
    geom_raster(aes(x = y, y = x, fill = z), volcano_long, interpolate = TRUE, show.legend = FALSE),
    id = 'volcano'
  ) + 
  scale_fill_gradientn(colours = rainbow(15)) + 
  with_displacement(
    geom_circle(aes(x0 = 44, y0 = 31, r = 20), size = 10),
    x_map = ch_red('volcano'),
    y_map = ch_blue('volcano'), 
    x_scale = 5,
    y_scale = 5
  )</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
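<p>
If you wonder what a single channel looks like in numbers, the sketch below pulls the channels out of the rainbow palette with base R’s <code>col2rgb()</code>. The luminosity weights are one common approximation (Rec. 601) and only an assumption made for the sake of the example; what the <code>ch_</code> functions compute internally may differ.
</p>
<pre class="r"><code># The same palette as in the example above
pal &lt;- rainbow(15)

# One row per channel, one column per colour, values from 0 to 255
channels &lt;- col2rgb(pal)

red &lt;- channels['red', ]
blue &lt;- channels['blue', ]

# A weighted sum of the three channels (assumed weights, see above)
luminosity &lt;- drop(c(0.299, 0.587, 0.114) %*% channels)

rbind(red, blue, luminosity = round(luminosity))</code></pre>
<p>
Whichever channel is picked, the result is one number per pixel, which is all a filter such as the displacement above needs.
</p>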
<p>
A last wrinkle to all this is that you don’t need to use other layers as references. You can use raster objects directly, or even a function that takes the width and height of the plot in pixels and generates a raster.
</p>
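<p>
As a sketch of the function approach, the hypothetical <code>gradient_raster()</code> below takes a width and a height in pixels and returns a raster holding a left-to-right greyscale gradient. The name and the gradient are made up for illustration, and only base R is used.
</p>
<pre class="r"><code># A hypothetical generator: takes the plot width and height in pixels
# and returns a raster of exactly that size
gradient_raster &lt;- function(width, height) {
  shade &lt;- seq(0, 1, length.out = width)
  # Each column holds a single shade, so the gradient runs left to right
  as.raster(matrix(rep(shade, each = height), nrow = height, ncol = width))
}

ras &lt;- gradient_raster(200, 100)
dim(ras)
plot(ras)</code></pre>
<p>
Going by the description above, a function like this could then be given wherever you would otherwise give a layer id, e.g.&nbsp;as <code>bg_layer</code>.
</p>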
<p>
When you are using raster objects you can control how they are placed using an assortment of <code>ras_</code> functions:
</p>
<pre class="r"><code>ggfx_logo &lt;- as.raster(magick::image_read(
  system.file('help', 'figures', 'logo.png', package = 'ggfx')
))

ggplot(mpg) + 
  with_blend(
    geom_point(aes(x = hwy, y = displ), size = 5),
    bg_layer = ras_fit(ggfx_logo, 'viewport'),
    blend_type = 'xor'
  )</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
<pre class="r"><code>ggplot(mpg) + 
  with_blend(
    geom_point(aes(x = hwy, y = displ), size = 5),
    bg_layer = ras_tile(ggfx_logo, 'viewport', anchor = 'center', flip = TRUE),
    blend_type = 'xor'
  )</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2021-03-19-say-goodbye-to-good-taste_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
</section>
<section id="why-oh-why" class="level2">
<h2 class="anchored" data-anchor-id="why-oh-why">
Why, oh why?
</h2>
<p>
Having had a glimpse of what ggfx can do you might sit back, horror-struck, asking yourself why I would launch such a full-on attack on the purity and simplicity of data visualisation. Surely, this can only be used to impede understanding and, to use a term popularised by Edward Tufte, create chart junk.
</p>
<p>
While there is some truth to the idea that data visualisations should communicate their content as clearly as possible, that is only one side of the coin and mainly applies to statistical charts. Data visualisation is also a device for storytelling, and here the visual appearance of the chart can serve to underline the story and make the conclusions memorable. Having the artistic means to do that directly in R, in a reproducible manner, instead of being forced to manually edit your chart afterwards, is a huge boon for the graphic ecosystem in R and will set the creativity free in some data visualisation practitioners. If you doubt me, have a look at how ggfx has been used to great effect in the Tidy Tuesday project, even before it was properly released.
</p>
</section>
<section id="wrapping-up" class="level2">
<h2 class="anchored" data-anchor-id="wrapping-up">
Wrapping up
</h2>
<p>
I’ve only shown a small glimpse of what ggfx can do. If I have piqued your interest I invite you to browse the <a href="https://ggfx.data-imaginist.com">package website</a>. There you can see examples of all the different filters, along with articles helping you to implement your own filters from scratch for the ultimate freedom.
</p>
<p>
Now, go out into the world and make some memorable charts!
</p>
</section>



 ]]></description>
  <category>package</category>
  <category>announcement</category>
  <category>ggfx</category>
  <category>visualization</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2021-03-19-say-goodbye-to-good-taste/</guid>
  <pubDate>Wed, 31 Mar 2021 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/ggfx_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>Insetting a new patchwork version</title>
  <link>https://tutor-church-15580.netlify.app/posts/2020-11-09-insetting-a-new-patchwork-version/</link>
  <description><![CDATA[ 




<p>
I’m delighted to announce that a new version of patchwork has been released on CRAN. This new version contains a bunch of small bug fixes as well as some prominent features, which will be showcased below.
</p>
<p>
If you are unaware of patchwork, it is a package that allows easy composition of graphics, primarily aimed at ggplot2, but with support for base graphics as well. You can read more about the package on its <a href="https://patchwork.data-imaginist.com">website</a>.
</p>
<p>
For the remainder of this post we’ll use the following plots as examples:
</p>
<pre class="r"><code>library(ggplot2)
library(patchwork)
p1 &lt;- ggplot(mtcars) + 
  geom_point(aes(mpg, disp)) + 
  ggtitle('Plot 1')

p2 &lt;- ggplot(mtcars) + 
  geom_boxplot(aes(gear, disp, group = gear)) + 
  ggtitle('Plot 2')

p3 &lt;- ggplot(mtcars) + 
  geom_point(aes(hp, wt, colour = mpg)) + 
  ggtitle('Plot 3')</code></pre>
<section id="support-for-insets" class="level2">
<h2 class="anchored" data-anchor-id="support-for-insets">
Support for insets
</h2>
<p>
At its inception patchwork was mainly designed to deal with the alignment of plots displayed in a grid. This focus left out a small, but for some important, piece of functionality: placing plots on top of each other. While it was possible to create a design with overlapping plots by combining different plotting areas:
</p>
<pre class="r"><code>design &lt;- c(area(1, 1, 2, 2), area(2, 2, 3, 3), area(1, 3, 2, 4))
plot(design)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<p>
…this would still enforce an underlying grid, something that is at odds with freely positioning insets. To make up for this, patchwork has now gained an <code>inset_element()</code> function, which marks the given graphics as an inset to be added to the preceding plot. The function allows you to specify the exact location of the edges of the inset in any grid unit you want, thus giving you full freedom over the placement:
</p>
<pre class="r"><code>p1 + inset_element(p2, left = 0.5, bottom = 0.4, right = 0.9, top = 0.8)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-3-1.png" width="672">
</p>
<p>
By default the positions use <code>npc</code> units, which go from 0 to 1 in the chosen area. Other units can be used as well by giving them explicitly:
</p>
<pre class="r"><code>p1 + inset_element(p2, left = unit(1, 'cm'), bottom = unit(30, 'pt'), right = unit(3, 'in'),
                   top = 0.8)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-4-1.png" width="672">
</p>
<p>
The default is to position the inset relative to the panel, but this can be changed with the <code>align_to</code> argument:
</p>
<pre class="r"><code>p1 + inset_element(p2, left = 0.5, bottom = 0.4, right = 1, top = 1, align_to = 'full')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
When it comes to all other functionality in patchwork, insets behave as regular plots. This means that they are amenable to change after the composition:
</p>
<pre class="r"><code>p_all &lt;- p1 + inset_element(p2, left = 0.5, bottom = 0.4, right = 1, top = 1) + p3
p_all[[2]] &lt;- p_all[[2]] + theme_classic()
p_all</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<pre class="r"><code>p_all &amp; theme_dark()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
They can also get tagged automatically:
</p>
<pre class="r"><code>p_all + plot_annotation(tag_levels = 'A')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
<p>
which can be turned off in the same manner as for <code>wrap_elements()</code>:
</p>
<pre class="r"><code>p_all &lt;- p1 + 
  inset_element(p2, left = 0.5, bottom = 0.4, right = 1, top = 1, ignore_tag = TRUE) + 
  p3
p_all + plot_annotation(tag_levels = 'A')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
</section>
<section id="arbitrary-tagging-sequences" class="level2">
<h2 class="anchored" data-anchor-id="arbitrary-tagging-sequences">
Arbitrary tagging sequences
</h2>
<p>
While we’re discussing tagging, patchwork now allows you to provide your own sequence to use, instead of relying on the Latin characters, Roman numerals, or Arabic numerals that patchwork understands. This is done by supplying a list of character vectors to the <code>tag_levels</code> argument instead of a single vector:
</p>
<pre class="r"><code>p_all &lt;- p1 | (p2 / p3)
p_all + plot_annotation(tag_levels = list(c('one', 'two', 'three')))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-10-1.png" width="672">
</p>
<p>
When working with multiple tagging levels, custom sequences can be mixed with the automatic ones:
</p>
<pre class="r"><code>p_all[[2]] &lt;- p_all[[2]] + plot_layout(tag_level = 'new')
p_all + plot_annotation(tag_levels = list(c('one', 'two', 'three'), 'a'), tag_sep = '-')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-11-1.png" width="672">
</p>
</section>
<section id="raster-support" class="level2">
<h2 class="anchored" data-anchor-id="raster-support">
Raster support
</h2>
<p>
While patchwork was designed with ggplot2 in mind, it has always supported additional graphic types such as grobs and base graphics (by using formula notation). This release adds support for an additional type: raster. The raster class (and nativeRaster class) are bitmap representations of images, and they are now recognized both directly and by the <code>wrap_elements()</code> function:
</p>
<pre class="r"><code>logo &lt;- system.file('help', 'figures', 'logo.png', package = 'patchwork')
logo &lt;- png::readPNG(logo, native = TRUE)

p1 + logo</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-12-1.png" width="672">
</p>
<p>
Since they are implemented as wrapped elements they can still be titled etc:
</p>
<pre class="r"><code>p1 + logo + ggtitle('Made with this:') + theme(plot.background = element_rect('grey'))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-13-1.png" width="672">
</p>
<p>
They can of course also be used with the new inset feature to easily add watermarks etc.
</p>
<pre class="r"><code>p1 + inset_element(logo, 0.9, 0.8, 1, 1, align_to = 'full') + theme_void()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-11-09-insetting-a-new-patchwork-version_files/figure-html/unnamed-chunk-14-1.png" width="672">
</p>
</section>
<section id="the-future" class="level2">
<h2 class="anchored" data-anchor-id="the-future">
The future
</h2>
<p>
That’s it for this release. There is no shortage of feature requests for patchwork and I’ll not make any promises, but I hope the next release will focus on adding support for gganimate, as well as improvements to the annotation feature so that global axis labels can be added and annotations are kept in nested plots.
</p>
<p>
Stay safe!
</p>
</section>



 ]]></description>
  <category>package</category>
  <category>announcement</category>
  <category>patchwork</category>
  <category>visualization</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2020-11-09-insetting-a-new-patchwork-version/</guid>
  <pubDate>Mon, 09 Nov 2020 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/patchwork_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>A noisy start</title>
  <link>https://tutor-church-15580.netlify.app/posts/2020-03-18-a-noisy-start/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/ambient_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
I was sure I had released this… Honestly, I thought the new version of ambient had landed on CRAN a year ago. What does that say about me as a developer? Probably not something very positive. One reason is probably that ambient is one of my smaller packages mostly made for myself. It generates noise patterns which is something I use extensively in my <a href="https://www.data-imaginist.com/art">generative art</a>. And the version of ambient I’m now announcing has been available on my own computer for a long time, so I haven’t noticed the lack of a real CRAN release.
</p>
<section id="what-is-noise" class="level2">
<h2 class="anchored" data-anchor-id="what-is-noise">
What is noise
</h2>
<p>
Anyway, what is this package really about? It is a package that facilitates the generation of multidimensional noise of different kinds. Noise should not be equated with completely random values; R already has extensive support for generating those through the different distribution sampling functions. The noise that ambient is capable of producing is random, but spatially correlated… what on earth is that? Let’s have a look!
</p>
<pre class="r"><code>library(ambient)
library(dplyr)

image(noise_perlin(dim = c(300, 400)))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-18-a-noisy-start_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<p>
We see in the above example that the pattern is sort of random, but it remains structured so the value at each point is highly correlated to its neighbors. While we have looked at a 2D example, this principle can be expanded to 3 or even 4 dimensions.
</p>
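<p>
One way to put a number on “spatially correlated” is to compare each cell with the cell right next to it. The sketch below does that for the perlin noise and for plain uniform random values; <code>neighbour_cor()</code> is a small helper written for this illustration and is not part of ambient.
</p>
<pre class="r"><code>library(ambient)

set.seed(1)
noise &lt;- noise_perlin(dim = c(100, 100))
random &lt;- matrix(runif(100 * 100), nrow = 100)

# Correlation between each cell and the cell directly below it
neighbour_cor &lt;- function(m) {
  cor(as.vector(m[-nrow(m), ]), as.vector(m[-1, ]))
}

neighbour_cor(noise)
neighbour_cor(random)</code></pre>
<p>
The noise matrix should come out with a correlation far above that of the uniform values, which should hover around zero.
</p>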
<p>
The example above used the old interface which is already available on CRAN. That interface simply returns matrices or arrays with the x and y (and z and t) values corresponding to the indices of each cell. This is fast, but super limiting, and the new and promoted interface that you’ll see in a second adds much more control and power.
</p>
</section>
<section id="a-new-api" class="level2">
<h2 class="anchored" data-anchor-id="a-new-api">
A new API
</h2>
<p>
The limitation of the old API was mainly that you were bound to only retrieve values at integer coordinates. This in turn limited the amount of weird operations you might want to do to the coordinates before using them to calculate a noise value. Further, it simply felt clunky and didn’t fit in very well with any type of function composition.
</p>
<p>
The new API (the old still exists) is centered around a long-format grid representation that you create with <code>long_grid()</code>. It basically creates an adorned data frame with coordinates for each row, but provides additional functionality for converting back to matrices/arrays and raster objects:
</p>
<pre class="r"><code>grid &lt;- long_grid(x = seq(0, 1, length.out = 1000),
                  y = seq(0, 1, length.out = 1000))

grid</code></pre>
<pre><code>## # A tibble: 1,000,000 x 2
##        x       y
##    &lt;dbl&gt;   &lt;dbl&gt;
##  1     0 0      
##  2     0 0.00100
##  3     0 0.00200
##  4     0 0.00300
##  5     0 0.00400
##  6     0 0.00501
##  7     0 0.00601
##  8     0 0.00701
##  9     0 0.00801
## 10     0 0.00901
## # … with 999,990 more rows</code></pre>
<p>
You can create higher dimensions by simply providing <code>z</code> and <code>t</code> arguments to <code>long_grid()</code> as well. This is all kind of boring of course since we haven’t added any noise yet (which is kinda the point of all this). Don’t worry - it will come.
</p>
</section>
<section id="the-generators" class="level2">
<h2 class="anchored" data-anchor-id="the-generators">
The generators
</h2>
<p>
There are many different types of noise that can be generated with ambient. Perlin noise is perhaps the most well-known (it did land its creator an Oscar after all), but many others exist with different characteristics. All of these can be sampled with the new family of <code>gen_*()</code> functions (generator functions). These all take coordinates along with other arguments such as <code>frequency</code> and <code>seed</code>. As an example, let’s calculate some worley noise:
</p>
<pre class="r"><code>grid &lt;- grid %&gt;% 
  mutate(
    noise = gen_worley(x, y, frequency = 5, value = 'distance')
  )
grid</code></pre>
<pre><code>## # A tibble: 1,000,000 x 3
##        x       y noise
##    &lt;dbl&gt;   &lt;dbl&gt; &lt;dbl&gt;
##  1     0 0       0.203
##  2     0 0.00100 0.207
##  3     0 0.00200 0.211
##  4     0 0.00300 0.215
##  5     0 0.00400 0.219
##  6     0 0.00501 0.223
##  7     0 0.00601 0.228
##  8     0 0.00701 0.232
##  9     0 0.00801 0.236
## 10     0 0.00901 0.241
## # … with 999,990 more rows</code></pre>
<p>
We have now created a new column with the respective worley noise value for each cell. It is usually easier to understand by looking at it:
</p>
<pre class="r"><code>grid %&gt;% 
  plot(noise)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-18-a-noisy-start_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
We see that the <code>as.raster()</code> method takes an expression that defines what value should be used for the raster. We normalize it so that it lies between 0 and 1 (a requirement of the raster class) and then use the plot method provided for the raster class.
</p>
<p>
There are a bunch of these <code>gen_*()</code> functions. Further, there are also a bunch of <code>gen_*()</code> functions for creating non-noise patterns, e.g.
</p>
<pre class="r"><code>grid %&gt;% 
  mutate(
    pattern = gen_waves(x, y, frequency = 5)
  ) %&gt;%  
  plot(pattern)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-18-a-noisy-start_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<p>
You may feel at this point that the old interface was much nicer, but the great thing about the generators is that they don’t care whether the coordinates you feed into them lie in a grid. This means that they can be used to directly look up noise values for particles in a simulation, or modify the grid coordinates before they are passed into the generator. The latter is what is known as noise perturbation and was only available in a very limited form in the old API.
</p>
<pre class="r"><code>grid %&gt;% 
  mutate(
    pertube = gen_simplex(x, y, frequency = 5) / 10,
    noise = gen_worley(x + pertube, y + pertube, value = 'distance', frequency = 5)
  ) %&gt;% 
  plot(noise)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-18-a-noisy-start_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
Funky, right? Just to explain what is really going on, each cell in the grid gets a simplex based value, which it then uses to offset its own coordinates before looking up its worley noise value. As simplex noise has a smooth gradient we get these wavy distortions of the worley noise.
</p>
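<p>
The other use case mentioned above, looking up noise values for freely moving particles, can be sketched like this. The particles are made up for the example and scattered at random, so their coordinates do not lie on any grid.
</p>
<pre class="r"><code>library(ambient)

set.seed(42)
# 500 hypothetical particles scattered freely in the unit square
particles &lt;- data.frame(x = runif(500), y = runif(500))

# Look up the noise value at the exact position of each particle
particles$noise &lt;- gen_simplex(particles$x, particles$y, frequency = 5, seed = 1)

head(particles)</code></pre>
<p>
Since the generator only sees coordinates, the same call works unchanged after the particles have moved.
</p>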
</section>
<section id="fractured-noise" class="level2">
<h2 class="anchored" data-anchor-id="fractured-noise">
Fractured noise
</h2>
<p>
The output of e.g.&nbsp;<code>gen_perlin()</code> does not look like what you’d expect if you are used to working with perlin noise (I’d guess). This is because perlin noise is most often used in its fractal form. Fractal noise simply means calculating multiple values for each coordinate at different frequencies and somehow combining them. The most well known is <em>fractal brownian motion</em> (fbm), which simply adds the values together with decreasing intensity, but any combination scheme is possible and ambient comes with a few. To create fractal noise with the new interface we use the <code>fracture()</code> function and pass in a generator and a fractal function along with the different arguments to them:
</p>
<pre class="r"><code># Classic perlin noise (combining 4 different frequencies)
grid %&gt;% 
  mutate(
    noise = fracture(gen_perlin, fbm, octaves = 4, x = x, y = y, freq_init = 5)
  ) %&gt;% 
  plot(noise)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-18-a-noisy-start_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
<p>
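<p>
To see what fbm boils down to, here is a hand-rolled sketch that sums four perlin generators. It assumes that the frequency doubles and the weight halves for each octave, and it picks a new seed for each layer. Those are choices made for this illustration, so the result will not match the output of <code>fracture()</code> exactly.
</p>
<pre class="r"><code>library(ambient)
library(dplyr)

small_grid &lt;- long_grid(x = seq(0, 1, length.out = 200),
                        y = seq(0, 1, length.out = 200))

# Four octaves: the frequency doubles and the weight halves at each step
fbm_by_hand &lt;- small_grid %&gt;% 
  mutate(
    noise = gen_perlin(x, y, frequency = 5, seed = 1) + 
      gen_perlin(x, y, frequency = 10, seed = 2) / 2 + 
      gen_perlin(x, y, frequency = 20, seed = 3) / 4 + 
      gen_perlin(x, y, frequency = 40, seed = 4) / 8
  )

plot(fbm_by_hand, noise)</code></pre>
<p>
The low frequencies provide the overall shape while the higher ones add progressively finer detail, which is where the familiar cloudy look comes from.
</p>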
ambient comes with a handful of different fractal functions and you can create your own as well:
</p>
<pre class="r"><code># clamp noise before adding them together
grid %&gt;% 
  mutate(
    noise = fracture(gen_perlin, clamped, octaves = 4, x = x, y = y, freq_init = 5)
  ) %&gt;% 
  plot(noise)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-18-a-noisy-start_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
<hr>
<p>
There are a few other functions as part of this release for e.g.&nbsp;blending values together and calculating derived values from noise fields (e.g.&nbsp;curl and gradient). I will leave it up to you to explore these on your own.
</p>
</section>



 ]]></description>
  <category>package</category>
  <category>announcement</category>
  <category>ambient</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2020-03-18-a-noisy-start/</guid>
  <pubDate>Wed, 18 Mar 2020 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/ambient_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>Vectorising like a (semi)pro</title>
  <link>https://tutor-church-15580.netlify.app/posts/2020-03-15-vectorizing-like-a-semi-pro/</link>
  <description><![CDATA[ 




<blockquote class="blockquote">
<p>
This is a short practical post about programming with R. Take it for what it is and nothing more…
</p>
</blockquote>
<p>
R is slow! That is what they keep telling us (<em>they</em> being someone who “knows” about “real” programming and has another language that they for some reason fail to be critical about).
</p>
<p>
R is a weird thing. Especially for people who have been trained in a classical programming language. One of the main reasons for this is its vectorised nature, which is not just about the fact that vectors are prevalent in the language, but is an underlying principle that should guide the design of efficient algorithms in the language. If you write R like you write C (or Python), then sure it is slow, but really, you are just using it wrong.
</p>
<p>
This post will take you through the design of a vectorised function. The genesis of the function comes from my generative art, but I thought it was so nice and self-contained that it would make a good blog post. If that seems like something that could take your mind off the pandemic, then buckle up!
</p>
<section id="the-problem" class="level2">
<h2 class="anchored" data-anchor-id="the-problem">
The problem
</h2>
<p>
I have a height-map, that is, a matrix of numeric values. You know what? Let’s make this concrete and create one:
</p>
<pre class="r"><code>library(ambient)
library(dplyr)

z &lt;- long_grid(1:100, 1:100) %&gt;% 
  mutate(val = gen_simplex(x, y, frequency = 0.02)) %&gt;% 
  as.matrix(val)

image(z, useRaster = TRUE)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-15-vectorizing-like-a-semi-pro_files/figure-html/unnamed-chunk-1-1.png" width="672">
</p>
<p>
This is just some simplex noise of course, but it fits our purpose…
</p>
<p>
Anyway, we have a height-map and we want to find the local extrema, that is, the local minima and maxima. That’s it. Quite a simple and understandable challenge, right?
</p>
</section>
<section id="vectorised-smecktorised" class="level2">
<h2 class="anchored" data-anchor-id="vectorised-smecktorised">
Vectorised, smecktorised
</h2>
<p>
Now, had you been a trained C-programmer you would probably have solved this with a loop. This is the way it should be done in C, but applying this to R will result in a very annoyed programmer who will tell anyone who cares to listen that R is slow.
</p>
<p>
We already knew this. We want something vectorised, right? But what is vectorised anyway? All over the internet the recommendation is to use the <code>apply()</code>-family of functions to vectorise your code, but I have some bad news for you: this is absolutely the wrong way to vectorise. There are a lot of good reasons to use the functional approach to looping instead of the for-loop, but when it comes to R, performance is not one of them.
</p>
<p>
Shit…
</p>
<p>
To figure this out, we need to be a bit more clear about what we mean by a vectorised function. There are some different ways to think about it:
</p>
<ol style="list-style-type: decimal">
<li>
The broad and lazy definition is a function that operates on the elements of a vector. This is where <code>apply()</code> (and friends) based functions reside.
</li>
<li>
The narrow and performant definition is a function that operates on the elements of a vector <em>in compiled code</em>. This is where many of R’s base functions live, along with properly designed functions implemented in C or C++.
</li>
<li>
The middle ground is a function that is composed of calls to <em>2.</em> to avoid explicit loops, thus deferring most element-wise operations to compiled code.
</li>
</ol>
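<p>
The difference between <em>1.</em> and <em>3.</em> can be sketched with a small rescaling function. Both versions below give the same result, but the first calls an R function once per element while the second only calls vectorised primitives. The function names are made up for the example.
</p>
<pre class="r"><code>set.seed(1)
x &lt;- runif(1e5)

# 1. Broad definition: an R function is called once per element
rescale_apply &lt;- function(x) {
  lo &lt;- min(x)
  hi &lt;- max(x)
  vapply(x, function(v) (v - lo) / (hi - lo), numeric(1))
}

# 3. Middle ground: composed only of primitives (min(), max(), `-`, `/`)
# that each fall under definition 2 and do their looping in compiled code
rescale_vec &lt;- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

all.equal(rescale_apply(x), rescale_vec(x))

# Timings depend on the machine, so no numbers are promised here
system.time(rescale_apply(x))
system.time(rescale_vec(x))</code></pre>
<p>
Neither version contains an explicit loop, yet only the second one avoids doing element-wise work at the R level.
</p>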
<p>
We want to talk about <em>3</em>. Simply implementing this in compiled code would be cheating, and we wouldn’t learn anything.
</p>
</section>
<section id="thinking-with-vectors" class="level2">
<h2 class="anchored" data-anchor-id="thinking-with-vectors">
Thinking with vectors
</h2>
<p>
R comes with a lot of batteries included. Some of the more high-level functions are not implemented with performance in mind (sadly), but a lot of the basic stuff is, e.g.&nbsp;indexing, arithmetic, summations, etc. It turns out that these are often enough to implement pretty complex functions in an efficient vectorised manner.
</p>
<p>
Going back to our initial problem of finding extrema: What we effectively are asking for is a moving window function where each cell is evaluated on whether it is the largest or smallest value in its respective window. If you think a bit about this, it is mainly an issue of indexing. For each element in the matrix, we want the indices of all the cells within its window. Once we have that, it is pretty easy to extract all the relevant values, use the vectorised <code>pmin()</code> and <code>pmax()</code> functions to figure out the minimum and maximum value in the window, and use the (vectorised) <code>==</code> to see if the extremum is equal to the value of the cell.
</p>
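<p>
Before scaling this up to a matrix, the idea can be sketched in one dimension with a window of one neighbour on each side. The vector is made up for the example.
</p>
<pre class="r"><code>v &lt;- c(1, 3, 2, 5, 4, 4, 0)

# Shifted copies of the vector, padded with NA at the borders
left &lt;- c(NA, v[-length(v)])
right &lt;- c(v[-1], NA)

# Element-wise maximum across the three aligned vectors
window_max &lt;- pmax(left, v, right, na.rm = TRUE)

# A cell is a local maximum if it equals the maximum of its window
which(v == window_max)</code></pre>
<p>
This returns positions 2, 4, and 6. Note that ties count: position 6 shares its value with position 5 and is still reported, as nothing in its own window is larger.
</p>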
<p>
That’s a lot of talk, here is the final function:
</p>
<pre class="r"><code>extrema &lt;- function(z, neighbors = 2) {
  ind &lt;- seq_along(z)
  rows &lt;- row(z)
  cols &lt;- col(z)
  n_rows &lt;- nrow(z)
  n_cols &lt;- ncol(z)
  window_offsets &lt;- seq(-neighbors, neighbors)
  window &lt;- outer(window_offsets, window_offsets * n_rows, `+`)
  window_row &lt;- rep(window_offsets, length(window_offsets))
  window_col &lt;- rep(window_offsets, each = length(window_offsets))
  windows &lt;- mapply(function(i, row, col) {
    row &lt;- rows + row
    col &lt;- cols + col
    new_ind &lt;- ind + i
    new_ind[row &lt; 1 | row &gt; n_rows | col &lt; 1 | col &gt; n_cols] &lt;- NA
    z[new_ind]
  }, i = window, row = window_row, col = window_col, SIMPLIFY = FALSE)
  windows &lt;- c(windows, list(na.rm = TRUE))
  minima &lt;- do.call(pmin, windows) == z
  maxima &lt;- do.call(pmax, windows) == z
  extremes &lt;- matrix(0, ncol = n_cols, nrow = n_rows)
  extremes[minima] &lt;- -1
  extremes[maxima] &lt;- 1
  extremes
}</code></pre>
<p>
(don’t worry, we’ll go through it in a bit)
</p>
<p>
This function takes a matrix, and a neighborhood radius and returns a new matrix of the same dimensions as the input, with <code>1</code> in the local maxima, <code>-1</code> in the local minima, and <code>0</code> everywhere else.
</p>
<p>
Let’s go through it:
</p>
<pre class="r"><code># ...
  ind &lt;- seq_along(z)
  rows &lt;- row(z)
  cols &lt;- col(z)
  n_rows &lt;- nrow(z)
  n_cols &lt;- ncol(z)
# ...</code></pre>
<p>
Here we are simply doing some quick calculations upfront for reuse later. The <code>ind</code> variable is the linear index of each cell in the matrix. Matrices are simply vectors underneath, so they can be indexed like that as well. <code>rows</code> and <code>cols</code> hold the row and column index of each cell, and <code>n_rows</code> and <code>n_cols</code> are pretty self-explanatory.
</p>
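<p>
If the link between the linear index and the row and column index is new to you, here is a small sketch. The toy matrix <code>m</code> is made up for illustration and is not part of the function:
</p>
<pre class="r"><code># A made-up 2 x 3 matrix; values are stored column by column
m &lt;- matrix(c(5, 3, 8, 1, 9, 2), nrow = 2)

# The 4th element of the underlying vector is the cell in row 2, column 2
m[4] == m[2, 2]
row(m)[4]
col(m)[4]

# The linear index can be rebuilt from the row and column index
idx &lt;- row(m) + (col(m) - 1L) * nrow(m)
all(idx == seq_along(m))</code></pre>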
<pre class="r"><code># ...
  window_offsets &lt;- seq(-neighbors, neighbors)
  window &lt;- outer(window_offsets, window_offsets * n_rows, `+`)
  window_row &lt;- rep(window_offsets, length(window_offsets))
  window_col &lt;- rep(window_offsets, each = length(window_offsets))
# ...</code></pre>
<p>
Most of the magic happens here, but it is not that apparent. We use the <code>outer()</code> function to construct a matrix, the size of our window, holding the index offset from the center for each of the cells in the window. Because matrices are stored column by column, moving one row changes the linear index by 1 and moving one column changes it by <code>n_rows</code>. We also construct vectors holding the row and column offsets for each cell in the window.
</p>
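<p>
To make the offsets a bit more tangible, here is a minimal sketch for a radius of 1 on a made-up matrix with 4 rows (the matrix <code>z_small</code> and the choice of center cell are assumptions for the example):
</p>
<pre class="r"><code>n_rows &lt;- 4                    # assumed number of rows in the toy matrix
window_offsets &lt;- seq(-1, 1)   # neighbors = 1
window &lt;- outer(window_offsets, window_offsets * n_rows, `+`)
window
#      [,1] [,2] [,3]
# [1,]   -5   -1    3
# [2,]   -4    0    4
# [3,]   -3    1    5

# Adding the offsets to the index of a cell gives its 3 x 3 neighborhood.
# The cell in row 2, column 3 has linear index 2 + (3 - 1) * 4 = 10
z_small &lt;- matrix(1:20, nrow = n_rows)
neighborhood &lt;- z_small[10 + as.vector(window)]
all(neighborhood == as.vector(z_small[1:3, 2:4]))</code></pre>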
<pre class="r"><code># ...
  windows &lt;- mapply(function(i, row, col) {
    row &lt;- rows + row
    col &lt;- cols + col
    new_ind &lt;- ind + i
    new_ind[row &lt; 1 | row &gt; n_rows | col &lt; 1 | col &gt; n_cols] &lt;- NA
    z[new_ind]
  }, i = window, row = window_row, col = window_col, SIMPLIFY = FALSE)
# ...</code></pre>
<p>
This is where all the magic appears to happen. For each cell in the window, we are calculating its respective value for each cell in the input matrix. I can already hear you scream about me using an <code>apply()</code>-like function, but the key thing is that I’m not using it to loop over the elements of the input vector (or matrix), but over a much smaller (and often fixed) number of elements.
</p>
<p>
If you want to leave now because I’m moving the goal-posts, then be my guest.
</p>
<p>
Anyway, what is happening inside the <code>mapply()</code> call? Inside the function we figure out which row and column the offset cell is part of. Then we calculate the index of the cells for the offset. In order to guard against windows running off the edge of the matrix (where the linear index would silently wrap into a neighboring column or fall outside the vector), we set all the indices that are out of bounds to <code>NA</code>, and then we simply index into our matrix. The crucial part is that all of the operations here are vectorised (indexing, arithmetic, and comparisons). In the end we get a list holding vectors of values for each cell in the window.
</p>
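<p>
Here is a sketch of what a single iteration does, using the offset “one row up” on a made-up 3 x 4 matrix. The variable names mirror the function, but <code>z_small</code>, <code>shifted_row</code>, <code>shifted_col</code>, and <code>above</code> are only there for illustration:
</p>
<pre class="r"><code>z_small &lt;- matrix(1:12, nrow = 3)
ind &lt;- seq_along(z_small)
rows &lt;- row(z_small)
cols &lt;- col(z_small)
n_rows &lt;- nrow(z_small)
n_cols &lt;- ncol(z_small)

# Offset of one row up: row offset -1, column offset 0, index offset -1
shifted_row &lt;- rows - 1
shifted_col &lt;- cols
new_ind &lt;- ind - 1

# Cells in the first row have nothing above them. Without this guard,
# index 4 - 1 = 3 would wrongly pick the bottom cell of the previous column
new_ind[shifted_row &lt; 1 | shifted_row &gt; n_rows |
        shifted_col &lt; 1 | shifted_col &gt; n_cols] &lt;- NA

above &lt;- z_small[new_ind]
matrix(above, nrow = n_rows)
#      [,1] [,2] [,3] [,4]
# [1,]   NA   NA   NA   NA
# [2,]    1    4    7   10
# [3,]    2    5    8   11</code></pre>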
<pre class="r"><code># ..
  windows &lt;- c(windows, list(na.rm = TRUE))
  minima &lt;- do.call(pmin, windows) == z
  maxima &lt;- do.call(pmax, windows) == z
  extremes &lt;- matrix(0, ncol = n_cols, nrow = n_rows)
  extremes[minima] &lt;- -1
  extremes[maxima] &lt;- 1
  extremes
# ..</code></pre>
<p>
This is really just wrapping up, even though the actual computations are happening here. We use <code>pmin()</code> and <code>pmax()</code> to find the minimum and maximum across each window, and compare them to the values in our input matrix (again, all properly vectorised functions). In the end we construct a matrix holding <code>0</code>s and use the calculated positions to set <code>1</code> or <code>-1</code> at the location of local extremes.
</p>
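<p>
If the <code>do.call()</code> trick looks opaque, here is a small sketch of it in isolation. The three vectors stand in for three window positions and the values are made up; <code>NA</code> plays the role of an out-of-bounds cell:
</p>
<pre class="r"><code># Each list element holds the values at one window position for three cells
toy_windows &lt;- list(c(3, NA, 7), c(1, 5, 9), c(4, 2, NA))

# Appending na.rm = TRUE to the list passes it on as a named argument
args &lt;- c(toy_windows, list(na.rm = TRUE))
window_min &lt;- do.call(pmin, args)  # 1 2 7
window_max &lt;- do.call(pmax, args)  # 4 5 9

# Comparing with the (made-up) center values flags the extremes
center &lt;- c(1, 5, 8)
window_min == center  # TRUE FALSE FALSE
window_max == center  # FALSE TRUE FALSE</code></pre>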
</section>
<section id="does-it-work" class="level2">
<h2 class="anchored" data-anchor-id="does-it-work">
Does it work?
</h2>
<p>
I guess that is the million dollar question, closely followed by “is it faster?”. I don’t really care enough to implement a “dumb” vectorisation, so I’ll just put my head on the block with the last question and insist that, yes, it is much faster. You can try to beat me with an <code>apply()</code> based solution and I’ll eat a sticker if you succeed (unless you cheat).
</p>
<p>
As for the first question, let’s have a look:
</p>
<pre class="r"><code>extremes &lt;- extrema(z)
extremes[extremes == 0] &lt;- NA

image(z, useRaster = TRUE)
image(extremes, col = c('black', 'white'), add = TRUE)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2020-03-15-vectorizing-like-a-semi-pro_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
Lo and behold, it appears as if we succeeded.
</p>
</section>
<section id="can-vectorisation-save-the-world" class="level2">
<h2 class="anchored" data-anchor-id="can-vectorisation-save-the-world">
Can vectorisation save the world?
</h2>
<p>
No…
</p>
<p>
More to the point, not every problem has a nice vectorised solution. Further, the big downside with proper vectorisation is that it often requires expanding a lot of variables to the size of the input vector. In our case we needed to hold all windows in memory simultaneously, and it does not take too much imagination to think up scenarios where that may make our computer explode. Still, more often than not it is possible to write super performant R code, and usually the crucial part is to figure out how to do some intelligent indexing.
</p>
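<p>
To put a rough number on that, here is a back-of-the-envelope sketch. The helper <code>window_memory()</code> is hypothetical, and it assumes 8 bytes per value (a double matrix) while ignoring any temporary copies R makes along the way:
</p>
<pre class="r"><code># Bytes needed to hold every window position for every cell at once
window_memory &lt;- function(n_rows, n_cols, neighbors = 2, bytes_per_value = 8) {
  n_windows &lt;- (2 * neighbors + 1)^2
  n_windows * n_rows * n_cols * bytes_per_value
}

# A 1000 x 1000 matrix with the default radius: 25 windows, about 200 MB
window_memory(1000, 1000) / 1e6

# A 10000 x 10000 matrix with a radius of 5: 121 windows, about 96.8 GB
window_memory(10000, 10000, neighbors = 5) / 1e9</code></pre>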
<p>
If you are still not convinced then read through <a href="https://www.brodieg.com">Brodie Gaslam’s blog</a>. He has a penchant for implementing ridiculously complicated stuff in highly efficient R code. It goes without saying that his posts are often more involved than this, but if you have kept reading until this point, I think you are ready…
</p>
</section>



 ]]></description>
  <category>programming</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2020-03-15-vectorizing-like-a-semi-pro/</guid>
  <pubDate>Sun, 15 Mar 2020 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/logo.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>Don’t be a Dick</title>
  <link>https://tutor-church-15580.netlify.app/posts/2019-12-13-don-t-be-a-dick/</link>
  <description><![CDATA[ 




<p>As the year reaches its final conclusion I usually take a look back at the year that passed and do some navel gazing… I released this and that, I did a talk or two, etc.</p>
<p>But honestly… I’ve made new releases, I have a great job that allows me to work on open source software and get paid for doing my hobby. Life is good!</p>
<p>The world is shit though…</p>
<section id="a-plea" class="level2">
<h2 class="anchored" data-anchor-id="a-plea">A Plea</h2>
<p>I’ve spent most of my adult life developing software and giving it away for free, no strings attached. I try to be a welcoming and helpful part of the community. Lines have to be drawn though…</p>
<p>If you support fascism, racism, misogyny, or any of the other ugly heads that bigotry has, either openly or indirectly by voting for the likes of Trump or Johnson, I have a plea for you:</p>
<p>don’t use my code.</p>
<p>don’t open issues.</p>
<p>don’t ask for help.</p>
<p>This is not an addendum to any license I provide, nor is it in any way legally binding (I wouldn’t know how to achieve that). This is simply a plea from one person to another. You choose to support movements that run counter to everything open source software stands for, and the least you could do is to not stand on our shoulders as you fight us.</p>
<p>It goes without saying that this plea can only extend to the code I create in my spare time and release as a private person.</p>
<p>It also goes without saying that, should this plea apply to you, you’ll probably ignore it because you have already cast aside decency. If you do ignore it just know: I actively despise you as a user…</p>
</section>
<section id="to-everyone-else" class="level2">
<h2 class="anchored" data-anchor-id="to-everyone-else">To Everyone Else</h2>
<p>As the world goes to shit I’d like to up my commitment to support those hurt the most by it. Do you need feedback, help, or otherwise, with anything I might be able to chip in on (mostly R and generative art), and are you a minority in any way, I invite you to reach out, and I’ll do what I can.</p>
<p>Merry Christmas</p>
</section>
<section id="addendum-161219" class="level2">
<h2 class="anchored" data-anchor-id="addendum-161219">Addendum 16/12/19</h2>
<p>Thankfully people have mostly reacted positively to this post. This was kind of expected as I think the R community is by and large on the right side of history. A few people have taken issue with me naming political leaders and their supporters directly, thinking this is about political disagreement. This is not the case.</p>
<p>Bigotry is not politics!</p>
<p>You can be a republican and not support Trump. You can be a tory and not support Johnson and if you do I both applaud you for your conviction and welcome you to my small sphere of R packages. If you are a republican/tory and feel uneasy about the current leadership, but still choose to vote for them, you have put your moral values up for sale. This is entirely your choice. I will call you out on it.</p>
<p>Others have indicated that any type of disagreement should simply not have a bearing in the open source world. First, OS is activistic in its very nature, and second, good job on living a privileged life if you think this is the first time people take a stand in the R world…</p>


</section>

 ]]></description>
  <guid>https://tutor-church-15580.netlify.app/posts/2019-12-13-don-t-be-a-dick/</guid>
  <pubDate>Fri, 13 Dec 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/logo.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>Patch it up and send it out</title>
  <link>https://tutor-church-15580.netlify.app/posts/2019-11-28-patch-it-up-and-send-it-out/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/patchwork_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
I am super, super thrilled to finally be able to announce that patchwork has been released on CRAN. Patchwork has, without a doubt, been my most popular unreleased package and it is great to finally make it available to everyone.
</p>
<p>
Patchwork is a package for composing plots, i.e.&nbsp;placing multiple plots together in the same figure. It is not the only package that tries to solve this. <code>grid.arrange()</code> from gridExtra, and <code>plot_grid()</code> from <code>cowplot</code> are two popular choices while some will claim that all you need is base graphics and <code>layout()</code> (they would be wrong, though). Do we really need another package for this? I personally feel that patchwork brings enough innovation to the table to justify its existence, but if you are a happy user of <code>cowplot::plot_grid()</code> I’m not here to force you away from that joy.
</p>
<p>
The claim to fame of patchwork is mainly two things: A very intuitive API, and a layout engine that promises to keep your plots aligned no matter how complex a layout you concoct.
</p>
<pre class="r"><code>library(ggplot2)
library(patchwork)

p1 &lt;- ggplot(mpg) + 
  geom_point(aes(hwy, displ))
p2 &lt;- ggplot(mpg) + 
  geom_bar(aes(manufacturer, fill = stat(count))) + 
  coord_flip()

# patchwork allows you to add plots together
p1 + p2</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<p>
If you find this intriguing, you should at least give patchwork a passing glance. I’ve already written at length about all of its features at its <a href="https://patchwork.data-imaginist.com">webpage</a>, so if you don’t want to entertain my ramblings more than necessary, make haste to the <a href="https://patchwork.data-imaginist.com/articles/patchwork.html">Getting Started</a> guide, or one of the in-depth guides covering:
</p>
<ul>
<li>
<a href="https://patchwork.data-imaginist.com/articles/guides/assembly.html">Assembling Plots</a>
</li>
<li>
<a href="https://patchwork.data-imaginist.com/articles/guides/layout.html">Defining Layouts</a>
</li>
<li>
<a href="https://patchwork.data-imaginist.com/articles/guides/annotation.html">Adding Annotation</a>
</li>
<li>
<a href="https://patchwork.data-imaginist.com/articles/guides/multipage.html">Aligning Across Pages</a>
</li>
</ul>
<section id="the-patch-that-worked" class="level2">
<h2 class="anchored" data-anchor-id="the-patch-that-worked">
The Patch that Worked
</h2>
<p>
If you are still here, I’ll tell you a bit more about the package, and round up with some examples of my favorite features in patchwork. As I described in <a href="https://www.data-imaginist.com/2017/looking-back-on-2017/">my look back at 2017</a> patchwork helped me out of a burn-out fueled by increasing maintenance burdens of old packages. At that time I don’t think I expected two years to pass before it got its proper release, but here we are… What I didn’t really go into there is why I started on the package. The truth is that I was beginning to think about the new gganimate API, but was unsure whether it was possible to add completely foreign objects to a ggplot and alter how it behaves, while still allowing normal ggplot2 objects to be added afterwards. I was not prepared to create a proof of concept of gganimate to test it out at this point, so I came up with the idea of trying to allow plots to be added together. The new behavior was that the two plots would be placed beside each other, and the last plot would still be able to receive new ggplot objects. It worked, obviously, and I began to explore this idea a bit more, adding more capabilities. I consciously didn’t advertise this package at all. I was still burned out and didn’t want to do anything for anyone but myself, but someone picked it up from my GitHub and made a moderately viral tweet about it, so it quickly became popular despite my intentions. I often joke that patchwork is my most elaborate tech-demo to date.
</p>
<p>
All that being said, I was in search of a better way to compose plots (I think most R users have cursed about misaligned axes and butchered <code>facet_wrap()</code> into a layout engine) and I now had a blurry vision of a solution, so I had to take it out of tech-demo land, and begin to treat it as a real package. But, along came gganimate and swallowed up all my development time. Further, I had hit a snag in how nested layouts worked that meant backgrounds and other elements were lost. This snag was due to a fundamental part of why patchwork otherwise worked so well, so I was honestly in no rush to get back to fixing it.
</p>
<p>
So patchwork lingered, unreleased…
</p>
<p>
At the start of 2019 I decided that the year should be dedicated to finishing off updates and unreleased packages, and by November only patchwork remained. I was still not feeling super excited about getting back to the aforementioned snag, but I saw no way out so I dived in. After having explored uncharted areas of grid in search of something that could reconcile the layout engine implementation with keeping backgrounds and other elements intact, I was ready to throw it all out, but I decided to see how hard it would be to simply rewrite a subset of the layout engine. One day later I had a solution… There is a moral in there somewhere, I’m sure, so feel free to use it.
</p>
</section>
<section id="the-golden-patches" class="level2">
<h2 class="anchored" data-anchor-id="the-golden-patches">
The Golden Patches
</h2>
<p>
I don’t want to repeat what I’ve written about at length in the guides I linked to in the beginning of the post, so instead I’ll end with simply a few of my favorite parts of patchwork. There will be little explanation about the code (again, check out the guides), so consider this a blindfolded tasting menu.
</p>
<pre class="r"><code># A few more plots to play with
p3 &lt;- ggplot(mpg) + 
  geom_smooth(aes(hwy, cty)) + 
  facet_wrap(~year)
p4 &lt;- ggplot(mpg) + 
  geom_tile(aes(factor(cyl), drv, fill = stat(count)), stat = 'bin2d')</code></pre>
<section id="human-centered-api" class="level3">
<h3 class="anchored" data-anchor-id="human-centered-api">
Human-Centered API
</h3>
<p>
Patchwork implements a few API innovations to make plot composition both quick and readable. Consider this code:
</p>
<pre class="r"><code>(p1 | p2) /
   p3</code></pre>
<p>
It is not too difficult to envision what kind of composition comes out of this and, lo and behold, it does exactly what is expected:
</p>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
As layout complexity increases, the use of operators gets less and less readable. Patchwork allows you to provide a textual representation of the layout instead, which scales much better:
</p>
<pre class="r"><code>layout &lt;- '
ABB
CCD
'
p1 + p2 + p3 + p4 + plot_layout(design = layout)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
</section>
<section id="capable-auto-tagging" class="level3">
<h3 class="anchored" data-anchor-id="capable-auto-tagging">
Capable auto-tagging
</h3>
<p>
When plot compositions are used in scientific literature, the subplots are often enumerated so they can be referred to in the figure caption and text. While you could do that manually, it is much easier to let patchwork do it for you.
</p>
<pre class="r"><code>patchwork &lt;- (p4 | p2) /
                p1
patchwork + plot_annotation(tag_levels = 'A')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
If you have a nested layout, as in the above, you can even tell patchwork to create a new tagging level for it:
</p>
<pre class="r"><code>patchwork &lt;- ((p4 | p2) + plot_layout(tag_level = 'new')) /
                 p1
patchwork + plot_annotation(tag_levels = c('A', '1'))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
</section>
<section id="it-allows-you-to-modify-subplots-all-at-once" class="level3">
<h3 class="anchored" data-anchor-id="it-allows-you-to-modify-subplots-all-at-once">
It allows you to modify subplots all at once
</h3>
<p>
What if you want to play around with the theme? Do you begin to change the theme of all of your subplots? No, you use the <code>&amp;</code> operator that allows you to add ggplot elements to all your subplots:
</p>
<pre class="r"><code>patchwork &amp; theme_minimal()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
</section>
</section>
<section id="it-shepherds-the-guides" class="level2">
<h2 class="anchored" data-anchor-id="it-shepherds-the-guides">
It shepherds the guides
</h2>
<p>
Look at the plot above. The guides are annoying, right? Let’s put them together:
</p>
<pre class="r"><code>patchwork + plot_layout(guides = 'collect')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-10-1.png" width="672">
</p>
<p>
That is, visually, better but really we only want a single guide for the fill. patchwork will remove duplicates, but only if they are alike. If we give them the same range, we get what we want:
</p>
<pre class="r"><code>patchwork &lt;- patchwork &amp; scale_fill_continuous(limits = c(0, 60))
patchwork + plot_layout(guides = 'collect')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-28-patch-it-up-and-send-it-out_files/figure-html/unnamed-chunk-11-1.png" width="672">
</p>
<p>
Pretty nice, right?
</p>
</section>
<section id="this-is-not-a-grammar" class="level2">
<h2 class="anchored" data-anchor-id="this-is-not-a-grammar">
This is not a grammar
</h2>
<p>
I’ll finish this post off with something that has been rummaging inside my head for a while, and this is as good a place as any to put it. It seems obvious to call patchwork a grammar of plot composition, after all it expands on ggplot2 which has a grammar of graphics. I think that would be wrong. A grammar is not an API, but a theoretical construct that describes the structure of something in a consistent way. An API can be based on a grammar (as is the case for ggplot2 and dplyr) which will guide its design, or a grammar can be developed in close concert with an API as I tried to do with gganimate. Not everything lends itself well to being described by a grammar, and an API is not necessarily bad if it is not based on one (conversely, it may be bad even if it is). Using operators to combine plots is hardly a reflection of an underlying coherent theory of plot composition, much less a reflection of a grammar. It is still a nice API though.
</p>
<p>
Why do I need to say this? It seems like the programming world has been taken over by grammars and you may feel bad about just solving a problem with a nice API. Don’t feel bad — “grammar” has just been conflated with “cohesive API” lately.
</p>
</section>
<section id="towards-some-new-packages" class="level2">
<h2 class="anchored" data-anchor-id="towards-some-new-packages">
Towards some new packages
</h2>
<p>
As mentioned in the beginning, I set out to mainly finish off stuff in 2019. tidygraph, ggforce, and ggraph have seen some huge updates, and with patchwork finally released I’ve reached my goal for the year with time to spare. I’ll be looking forward to creating something new again, but hopefully I will find a good rhythm where I don’t need to take a year off to update forgotten projects.
</p>
</section>



 ]]></description>
  <category>package</category>
  <category>announcement</category>
  <category>patchwork</category>
  <category>ggplot2</category>
  <category>visualization</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2019-11-28-patch-it-up-and-send-it-out/</guid>
  <pubDate>Sun, 01 Dec 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/patchwork_announce_1.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>The Colour of Everything</title>
  <link>https://tutor-church-15580.netlify.app/posts/2019-11-13-the-colour-of-everything/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/farver_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
I’m happy to announce that farver 2.0 has landed on CRAN. This is a big release comprising a rewrite of much of the internals along with a range of new functions and improvements. Read on to find out what this is all about.
</p>
<section id="the-case-for-farver" class="level2">
<h2 class="anchored" data-anchor-id="the-case-for-farver">
The case for farver
</h2>
<p>
The first version of farver really came out of necessity as I identified a major performance bottleneck in gganimate related to converting colours into Lab colour space and back when tweening them. This was a result of <code>grDevices::convertColor()</code> not being meant for use with millions of colour values. I built farver in order to address this very specific need, which in turn made Brodie Gaslam look into speeding up the grDevices function. The bottom line is that, while farver is still the fastest at converting between colour spaces, grDevices is now so fast that I probably wouldn’t have bothered to build farver in the first place had it been like this all along. I find this a prime example of fruitful open source competition and couldn’t be happier that Brodie took it upon himself.
</p>
<p>
So why a new shiny version? As part of removing compiled code from scales, we decided to adopt farver for colour interpolation, and the code could use a brush-up. I’ve become much more trained in writing compiled code, and further there were some shortcomings in the original implementation that needed to be addressed if scales (and thus ggplot2) were to depend on it. Further, I usually work on larger frameworks and there is a certain joy in picking a niche area that you care about and going ridiculously overboard in tooling without worrying about whether it benefits anyone other than yourself (ambient is another example of such indulgence).
</p>
</section>
<section id="the-new-old" class="level2">
<h2 class="anchored" data-anchor-id="the-new-old">
The new old
</h2>
<p>
The former version of farver was quite limited in functionality. It had two functions: <code>convert_colour()</code> and <code>compare_colour()</code> that did colour space conversion and colour distance calculations respectively. No outward changes have been made to these functions, but internally a lot has happened. The old versions had no input validation, so passing in colours with <code>NA</code>, <code>NaN</code>, <code>Inf</code>, and <code>-Inf</code> would give you some truly weird results back. Further, the input and output were not capped to the range of the given colour space, so you could in theory end up with negative RGB values if you converted from a colour space with a larger gamut than sRGB. Both of these issues have been rectified in the new version. Any non-finite value in any channel will result in <code>NA</code> in all channels in the output (for conversion) or an <code>NA</code> distance (for comparison).
</p>
<pre class="r"><code>library(farver)
colours &lt;- cbind(r = c(0, NA, 255), g = c(55, 165, 20), b = c(-Inf, 120, 200))
colours</code></pre>
<pre><code>##        r   g    b
## [1,]   0  55 -Inf
## [2,]  NA 165  120
## [3,] 255  20  200</code></pre>
<pre class="r"><code>convert_colour(colours, from = 'rgb', to = 'yxy')</code></pre>
<pre><code>##            y1        x        y2
## [1,]       NA       NA        NA
## [2,]       NA       NA        NA
## [3,] 25.93626 0.385264 0.1924651</code></pre>
<p>
Further, input is now capped to the channel range (if any) before conversion, and output is capped again before returning the result. The latter means that <code>convert_colour()</code> is only symmetric (ignoring rounding errors) if the colours are within gamut in both colour spaces.
</p>
<pre class="r"><code># Equal output because values are capped between 0 and 255
colours &lt;- cbind(r = c(1000, 255), g = 55, b = 120)
convert_colour(colours, 'rgb', 'lab')</code></pre>
<pre><code>##             l        a        b
## [1,] 57.41976 76.10097 12.44826
## [2,] 57.41976 76.10097 12.44826</code></pre>
<p>
Lastly, a new colour space has been added: CIELch(uv) (<code>hcl</code> in farver) joins its cousin CIELch(ab) (<code>lch</code>). Both are polar transformations, but the former is based on <code>luv</code> values and the latter on <code>lab</code>. The two colour spaces are often used interchangeably (though they are not equivalent), and as the <code>grDevices::hcl()</code> function is based on the <code>luv</code> space it made sense to provide an equivalent in farver.
</p>
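<p>
For the curious, a polar transformation simply swaps the two chromatic axes for a radius (chroma) and an angle (hue). Below is a minimal sketch of the idea in plain R for the Lab case. The helper <code>lab_to_lch()</code> is hypothetical and not part of farver, which does its conversions in compiled code; the same formula applied to the <code>u</code> and <code>v</code> channels gives the <code>luv</code>-based variant:
</p>
<pre class="r"><code># Hypothetical helper: polar transformation of a matrix with l, a, b columns
lab_to_lch &lt;- function(lab) {
  l &lt;- lab[, 1]
  a &lt;- lab[, 2]
  b &lt;- lab[, 3]
  chroma &lt;- sqrt(a^2 + b^2)                 # distance from the grey axis
  hue &lt;- (atan2(b, a) * 180 / pi) %% 360    # angle in degrees, in [0, 360)
  cbind(l = l, c = chroma, h = hue)
}

# Two made-up Lab colours: chroma is 5 and 10, hue roughly 53.13 and 270
polar &lt;- lab_to_lch(cbind(l = 50, a = c(3, 0), b = c(4, -10)))
polar</code></pre>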
</section>
<section id="the-new-new" class="level2">
<h2 class="anchored" data-anchor-id="the-new-new">
The new new
</h2>
<p>
The new functionality mainly revolves around the encoding of colour in text strings. In many programming languages colour can be encoded into strings as <code>#RRGGBB</code> where each channel is given in hexadecimal digits. This is also how colours are mostly passed around in R (R also has a list of recognized colour names that can be given as aliases instead of the hex string - see <code>grDevices::colours()</code> for a list). The encoding is convenient as it allows colours to be encoded into vectors, and thus into data frame columns or arrays, but means that if you need to perform operations on it you’d have to first decode the string into channels, potentially convert it into the required colour space, do the manipulation, convert back to sRGB, and encode it into strings. Encoding and decoding has been supported in grDevices with <code>rgb()</code> and <code>col2rgb()</code> respectively, both of which are pretty fast. <code>col2rgb()</code> has a quirk in that the output has the channels in the rows instead of the columns, contrary to how decoded colours are presented everywhere else:
</p>
<pre class="r"><code>grDevices::col2rgb(c('#56fec2', 'red'))</code></pre>
<pre><code>##       [,1] [,2]
## red     86  255
## green  254    0
## blue   194    0</code></pre>
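<p>
If you are stuck with grDevices, the quirk can be worked around by transposing the output. A minimal sketch of the round trip using only base functions (the variable names are made up for the example):
</p>
<pre class="r"><code># Transpose so that each row is a colour and each column a channel
channels &lt;- t(grDevices::col2rgb(c('#56fec2', 'red')))
channels
#      red green blue
# [1,]  86   254  194
# [2,] 255     0    0

# rgb() accepts a three-column matrix; the channels are on a 0-255 scale
encoded &lt;- grDevices::rgb(channels, maxColorValue = 255)
encoded
# [1] "#56FEC2" "#FF0000"</code></pre>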
<p>
farver sports two new functions that, besides providing consistency in the output format, also eliminate some steps in the workflow described above:
</p>
<pre class="r"><code># Decode strings with decode_colour
colours &lt;- decode_colour(c('#56fec2', 'red'))
colours</code></pre>
<pre><code>##        r   g   b
## [1,]  86 254 194
## [2,] 255   0   0</code></pre>
<pre class="r"><code># Encode with encode_colour
encode_colour(colours)</code></pre>
<pre><code>## [1] "#56FEC2" "#FF0000"</code></pre>
<p>
Besides the basic use shown above, both functions allow input/output from other colour spaces than sRGB. That means that if you need to manipulate some colour in Lab space, you can simply decode directly into that, do the manipulation and encode directly back. The functionality is baked into the compiled code, meaning that a lot of memory allocation is spared, making this substantially faster than a grDevices-based workflow:
</p>
<pre class="r"><code>library(ggplot2)

# Create some random colour strings
colour_strings &lt;- sample(grDevices::colours(), 5000, replace = TRUE)

# Get Lab values from a string
timing &lt;- bench::mark(
  farver = decode_colour(colour_strings, to = 'lab'),
  grDevices = convertColor(t(col2rgb(colour_strings)), 'sRGB', 'Lab', scale.in = 255), 
  check = FALSE,
  min_iterations = 100
)
plot(timing, type = 'ridge') +
  theme_minimal() + 
  labs(x = NULL, y = NULL)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-13-the-colour-of-everything_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
Can we do better than this? If the purpose is simply to manipulate a single channel in a colour encoded as a string, we may forego the encoding and decoding completely and do it all in compiled code. farver provides a family of functions for doing channel manipulation in string encoded colours. The channels can be any channel in any colour space supported by farver, and the decoding, manipulation and encoding is done in one pass. If you have a lot of colours and need to increase e.g.&nbsp;darkness, this can save a lot of memory allocation:
</p>
<pre class="r"><code># a lot of colours
colour_strings &lt;- sample(grDevices::colours(), 500000, replace = TRUE)

darken &lt;- function(colour, by) {
  colour &lt;- t(col2rgb(colour))
  colour &lt;- convertColor(colour, from = 'sRGB', 'Lab', scale.in = 255)
  colour[, 'L'] &lt;- colour[, 'L'] * by
  colour &lt;- convertColor(colour, from = 'Lab', to = 'sRGB')
  rgb(colour)
}
timing &lt;- bench::mark(
  farver = multiply_channel(colour_strings, channel = 'l', value = 1.2, space = 'lab'),
  grDevices = darken(colour_strings, 1.2),
  check = FALSE,
  min_iterations = 100
)
plot(timing, type = 'ridge') + 
  theme_minimal() + 
  labs(x = NULL, y = NULL)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-11-13-the-colour-of-everything_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
</section>
<section id="the-bottom-line" class="level2">
<h2 class="anchored" data-anchor-id="the-bottom-line">
The bottom line
</h2>
<p>
The new release of farver provides invisible improvements to the existing functions and a range of new functionality for working efficiently with string encoded colours. You will be using it indirectly following the next release of scales if you are plotting with ggplot2, but you shouldn’t be able to tell. If you somehow end up having to manipulate millions of colours, then farver is still the king of the hill by a large margin when it comes to performance, but I personally believe that it also provides a much cleaner API than any of the alternatives.
</p>
</section>



 ]]></description>
  <category>farver</category>
  <category>announcement</category>
  <category>package</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2019-11-13-the-colour-of-everything/</guid>
  <pubDate>Wed, 13 Nov 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/farver_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>1 giraffe, 2 giraffe, GO!</title>
  <link>https://tutor-church-15580.netlify.app/posts/2019-08-22-1-giraffe-2-giraffe-go/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/ggraph_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
I am beyond excited to finally be able to announce a new version of ggraph. This release, like the <a href="../2019-03-04-the-ggforce-awakens-again">ggforce 0.3.0 release</a>, has been many years in the making, lying dormant for long periods, first waiting for ggplot2 to get updated and then waiting for me to have time to finally finish it off. All that is in the past now as ggraph 2.0.0 has finally landed on CRAN, filled with numerous new features, a massive amount of bug fixes, and a slew of breaking changes.
</p>
<p>
If you are new to ggraph, a short description follows: It is an extension of ggplot2 that implements an extended grammar for relational data (e.g.&nbsp;trees and networks). It provides a huge variety of geoms for drawing nodes and edges, along with an assortment of layouts, making it possible to produce a very wide range of network visualization types. It is to my knowledge the most feature-packed network visualization framework available in R (and potentially in other languages as well), all building on top of the familiar ggplot2 API. If you want to learn more I invite you to browse the new <a href="https://ggraph.data-imaginist.com/">pkgdown website</a> that has been made available.
</p>
<section id="new-looks" class="level2">
<h2 class="anchored" data-anchor-id="new-looks">
New looks
</h2>
<p>
Before we begin with the exciting new stuff, there’s a small change that may or may not greet you as you make your first new plot with ggraph v2.0.0. The default look of a ggplot is often not a good fit for network visualisations as the positional scales are irrelevant. Because of this ggraph has since its release offered a <code>theme_graph()</code> that removed a lot of the useless clutter such as axes and grid lines. You had to use it deliberately though, as I didn’t want to overwrite any defaults you may have had. In the new release I’ve relaxed on this a bit. When you construct a ggraph plot it will still use the default theme as a base, but it will remove axes and grid lines from it. This makes it easier to use it together with corporate templates and the like right out of the box. You can still use <code>theme_graph()</code>, or potentially set it as a default using <code>set_graph_style()</code> if you so wish.
</p>
<pre class="r"><code>library(ggraph)

# The new default look:
ggraph(highschool) + 
  geom_edge_link() + 
  geom_node_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<pre class="r"><code># Using theme_graph for the remainder of this post
set_graph_style(size = 11, plot_margin = margin(0, 0, 0, 0))</code></pre>
</section>
<section id="the-broken-giraffe" class="level2">
<h2 class="anchored" data-anchor-id="the-broken-giraffe">
The broken giraffe
</h2>
<p>
Let us start proper with what this release breaks, because it does it for some very good reasons and you’ll all be happy about it shortly as you read on. The 1.x.x versions of ggraph worked with two different types of network representations: igraph objects and dendrogram objects. Some further types such as hclust and network objects were supported by automatic conversion, but that was it. Further, the internal architecture meant that certain layouts and geoms could only be used with certain objects. This was obviously an imperfect situation and one that reflected that tidygraph was developed after ggraph. In ggraph 2.0.0 the internals have been rewritten to only be based on tidygraph. This means that all layouts and geoms will always be available (as long as the topology supports it). This doesn’t mean that igraph, dendrogram, network, and hclust objects are no longer supported, though. ggraph will attempt to coerce every input to a tbl_graph object, and as tidygraph supports a wealth of network representations, ggraph can now be used with an even wider selection of objects, all completely without any need for change from the user.
</p>
<p>
While this change was completely internal and thus didn’t break anything, it did put into question the API of the <code>ggraph()</code> function, which had been designed before tidy evaluation and tidygraph came into existence. Prior to 2.0.0 all layout arguments passed into <code>ggraph()</code> (and <code>create_layout()</code>) would be passed as strings if they referenced any node or edge property, e.g.
</p>
<pre class="r"><code>library(tidygraph)

graph &lt;- as_tbl_graph(
  data.frame(
    from = sample(5, 20, TRUE),
    to = sample(5, 20, TRUE),
    weight = runif(20)
  )
)</code></pre>
<pre class="r"><code>ggraph(graph, layout = 'fr', weights = "weight") + 
  geom_edge_link() + 
  geom_node_point()</code></pre>
<p>
With the new API, edge and node parameters are passed along as unquoted expressions that will be evaluated in the context of the edge or node data respectively. The example above will thus be:
</p>
<pre class="r"><code>ggraph(graph, layout = 'fr', weights = weight) + 
  geom_edge_link() + 
  geom_node_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<p>
This change might seem superficial and unnecessary until you realize that this means the network object doesn’t have to be updated every time you want to try new edge and node parameters for the layout:
</p>
<pre class="r"><code>ggraph(graph, layout = 'fr', weights = sqrt(weight)) + 
  geom_edge_link() + 
  geom_node_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
So, that’s the extent of the breakage… Now what does this change allow..?
</p>
</section>
<section id="tidygraph-inside" class="level2">
<h2 class="anchored" data-anchor-id="tidygraph-inside">
Tidygraph inside
</h2>
<p>
The use of tidygraph runs much deeper than simply being used as the internal network representation. ggraph will also register the network object during creation and rendering of the plot, meaning that all tidygraph algorithms are available as input to layout specs and aesthetic mappings:
</p>
<pre class="r"><code>graph &lt;- as_tbl_graph(highschool)

ggraph(graph, layout = 'fr', weights = centrality_edge_betweenness()) + 
  geom_edge_link() + 
  geom_node_point(aes(size = centrality_pagerank(), colour = node_is_center()))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
<p>
It is obvious (at least to me) that this new-found capability will make it much easier to experiment and iterate on the visualization, hopefully inspiring users to try out different settings before settling on a plot.
</p>
<p>
As discussed above, the tidygraph integration also makes it easy to plot a wide variety of data types directly. Above we first created a tbl_graph from the <code>highschool</code> edge-list, but that is not strictly necessary:
</p>
<pre class="r"><code>head(highschool)</code></pre>
<pre><code>##   from to year
## 1    1 14 1957
## 2    1 15 1957
## 3    1 21 1957
## 4    1 54 1957
## 5    1 55 1957
## 6    2 21 1957</code></pre>
<pre class="r"><code>ggraph(highschool, layout = 'kk') + 
  geom_edge_link() + 
  geom_node_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
<p>
Note that even though the input is not a tbl_graph it will be converted to one so all the tidygraph algorithms are still available during plotting.
</p>
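<p>
To make that concrete, here is a minimal sketch that maps a tidygraph algorithm to an aesthetic while still passing the raw data.frame. The choice of <code>centrality_degree()</code> and the alpha value are just examples:
</p>
<pre class="r"><code>library(ggraph)
library(tidygraph)

# highschool is a plain data.frame of edges, yet the tidygraph
# algorithm can still be evaluated on the converted graph
ggraph(highschool, layout = 'kk') + 
  geom_edge_link(alpha = 0.3) + 
  geom_node_point(aes(size = centrality_degree()))</code></pre>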
<p>
To further make it easy to quickly gain an overview of your network data, ggraph gains a <code>qgraph()</code> function that inspects your input and automatically picks a layout and combination of edge and node geoms. While the return type is a standard ggraph/ggplot object it should not really be used as the basis for a more complicated plot, as you have no influence over how the layout and first couple of layers are chosen.
</p>
<pre class="r"><code>iris_clust &lt;- hclust(dist(iris[, 1:4]))

qgraph(iris_clust)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-10-1.png" width="672">
</p>
</section>
<section id="layout-galore" class="level2">
<h2 class="anchored" data-anchor-id="layout-galore">
Layout galore
</h2>
<p>
ggraph 2.0.0 comes with a huge selection of new layouts, from new algorithms for the classic node-edge diagram to completely new types such as matrix and (bio)fabric layouts. The biggest addition comes from the integration of the <a href="https://github.com/schochastics/graphlayouts">graphlayouts</a> package by <a href="https://twitter.com/schochastics">David Schoch</a> who has done a tremendous job in bringing new, high quality, layout algorithms to R. The <code>'stress'</code> layout is the new default as it does a much better job than fruchterman-reingold (<code>'fr'</code>). It also includes a sparse version, <code>'sparse_stress'</code>, for large graphs that is much faster than any of the ones provided by igraph.
</p>
<pre class="r"><code># Defaults to stress, with a message
ggraph(graph) + 
  geom_edge_link() + 
  geom_node_point()</code></pre>
<pre><code>## Using `stress` as default layout</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-11-1.png" width="672">
</p>
<p>
There are other layouts from graphlayouts of interest, e.g.&nbsp;the <code>'backbone'</code> layout that emphasizes community structure, the <code>'focus'</code> layout that places all nodes in concentric circles based on their distance to a selected node, etc. I won’t show them all here but instead direct you to its <a href="https://github.com/schochastics/graphlayouts">github page</a> that describes all its different layouts.
</p>
<p>
Another type of layout that has become available is the unrooted layout, with equal-angle and equal-daylight algorithms for drawing unrooted trees. Such trees differ from those resulting from e.g.&nbsp;hierarchical clustering in that they do not contain direction or a specific root node. The tree structure is only given by the branch length. To support this the <code>'dendrogram'</code> layout has gained a length argument that allows the layout to be calculated from branch length:
</p>
<pre class="r"><code>library(ape)
data(bird.families)
# Using the bird.families dataset from ape
ggraph(bird.families, 'dendrogram', length = length) + 
  geom_edge_elbow()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-12-1.png" width="672">
</p>
<p>
Often the dendrogram layout is a bad choice for unrooted trees, as it implicitly shows a node as the root and draws everything else according to that. Instead one can choose the <code>'unrooted'</code> layout, which attempts to spread the leaves evenly across the plane.
</p>
<pre class="r"><code>ggraph(bird.families, 'unrooted', length = length) + 
  geom_edge_link()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-13-1.png" width="672">
</p>
<p>
By default the equal-daylight algorithm is used, but it is also possible to get the simpler but less well-dispersed equal-angle version by setting <code>daylight = FALSE</code>.
</p>
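<p>
For comparison, a minimal sketch of the equal-angle version on the same tree. It only changes the <code>daylight</code> argument described above and keeps everything else as in the previous plot:
</p>
<pre class="r"><code>library(ggraph)
library(ape)
data(bird.families)

# Same tree as before, but laid out with the simpler equal-angle algorithm
ggraph(bird.families, 'unrooted', length = length, daylight = FALSE) + 
  geom_edge_link()</code></pre>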
<p>
The new version also brings two new special layouts (special meaning non-standard): <code>'matrix'</code> and <code>'fabric'</code>, which, like the <code>'hive'</code> layout, bring their own edge and node geoms. The matrix layout places nodes on a diagonal and shows edges by placing points at the horizontal and vertical intersection of the terminal nodes. The selling point of this layout is that it scales better as there is no possibility of edge crossings. On the other hand, matrix layouts are very dependent on the order in which nodes are placed, and as the network grows so does the number of possible orderings. There exists, however, a large range of node ranking algorithms that can be used to provide an effective ordering, and many of these are available in tidygraph. It can take some time getting used to matrix plots, but once you begin to recognize patterns in the plot and how they link to certain topological features of the network, they can become quite effective tools:
</p>
<pre class="r"><code># Create a graph where internal edges in communities are grouped
graph &lt;- create_notable('zachary') %&gt;%
  mutate(group = factor(group_infomap())) %&gt;%
  morph(to_split, group) %&gt;%
  activate(edges) %&gt;%
  mutate(edge_group = as.character(.N()$group[1])) %&gt;%
  unmorph()</code></pre>
<pre><code>## Warning: `as_quosure()` requires an explicit environment as of rlang 0.3.0.
## Please supply `env`.
## This warning is displayed once per session.</code></pre>
<pre class="r"><code>ggraph(graph, 'matrix', sort.by = node_rank_hclust()) + 
  geom_edge_point(aes(colour = edge_group), mirror = TRUE) + 
  coord_fixed()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-14-1.png" width="672">
</p>
<p>
As can be seen in the example above it is often useful to mirror edges to both sides of the diagonal to make the patterns stronger. Highly connected nodes are easily recognizable, without suffering from over-plotting, and by choosing an appropriate ranking algorithm communities are easily visible. In addition to <code>geom_edge_point()</code> ggraph also provides <code>geom_edge_tile()</code> for a different look.
</p>
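<p>
A minimal sketch of the tile version follows. To keep it self-contained it uses the plain <code>'zachary'</code> graph without the grouping from above, so there is no colour mapping here:
</p>
<pre class="r"><code>library(ggraph)
library(tidygraph)

zachary &lt;- create_notable('zachary')

# Same matrix layout and node ordering as before, but with tiles
# instead of points marking the edges
ggraph(zachary, 'matrix', sort.by = node_rank_hclust()) + 
  geom_edge_tile(mirror = TRUE) + 
  coord_fixed()</code></pre>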
<p>
The fabric layout (originally called biofabric, but I have decided to drop the prefix to indicate it can be used generally), is another layout approach that tries to deal with the problems of over-plotting. It does so by drawing all edges as evenly spaced vertical lines, and all nodes as evenly spaced horizontal lines. As with the matrix layout it is highly dependent on the sorting of nodes, and requires some getting used to. I urge you to give it a chance though, potentially with some help from the <a href="http://www.biofabric.org">website</a> its inventor has set up:
</p>
<pre class="r"><code>ggraph(graph, 'fabric', sort.by = node_rank_fabric()) + 
  geom_node_range(aes(colour = group), alpha = 0.3) + 
  geom_edge_span(aes(colour = edge_group), end_shape = 'circle') + 
  coord_fixed() + 
  theme(legend.position = 'top')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-15-1.png" width="672">
</p>
<p>
The <code>node_rank_fabric()</code> is the ranking proposed in the original paper, but other ranking algorithms are of course also possible.
</p>
<p>
The last new feature in the layout department is that it is now easier to plug in new layouts. First, by providing a matrix or data.frame to the <code>layout</code> argument in <code>ggraph()</code> you can quickly provide a fixed position of the nodes. The same can be obtained by providing an <code>x</code> and <code>y</code> argument to the <code>'auto'</code> layout. Second, you can provide a function directly to the <code>layout</code> argument. The function must take a tbl_graph as input and return a data.frame or an object coercible to one. This means that e.g.&nbsp;layouts defined as physics simulations with the particles package can be used directly:
</p>
<pre class="r"><code>library(particles)
# Set up simulation
sim &lt;- . %&gt;% simulate() %&gt;% 
  wield(manybody_force) %&gt;% 
  wield(link_force) %&gt;% 
  evolve()

ggraph(graph, sim) + 
  geom_edge_link(colour = 'grey') + 
  geom_node_point(aes(colour = group), size = 3)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-16-1.png" width="672">
</p>
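<p>
If you don’t want to bring in a simulation, the two plug-in routes can also be sketched with nothing but a data.frame and a small hand-written function. The coordinates in <code>pos</code> and the <code>circle_layout()</code> helper are made up for this example and are not part of ggraph:
</p>
<pre class="r"><code>library(ggraph)
library(tidygraph)

bull &lt;- create_notable('bull') # a small graph with 5 nodes

# 1. Fixed positions: one row per node, given in node order
pos &lt;- data.frame(x = c(0, -1, 1, -2, 2), y = c(0, 1, 1, 2, 2))
ggraph(bull, layout = pos) + 
  geom_edge_link() + 
  geom_node_point(size = 3)

# 2. A layout function: takes the graph and returns a data.frame
#    with x and y (here the nodes are simply placed on a unit circle)
circle_layout &lt;- function(graph, ...) {
  n &lt;- igraph::gorder(graph)
  angle &lt;- seq(0, 2 * pi, length.out = n + 1)[-(n + 1)]
  data.frame(x = cos(angle), y = sin(angle))
}
ggraph(bull, layout = circle_layout) + 
  geom_edge_link() + 
  geom_node_point(size = 3)</code></pre>
<p>
The function accepts <code>...</code> so that any additional layout arguments passed to <code>ggraph()</code> are silently ignored rather than causing an error.
</p>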
</section>
<section id="geoms-for-the-people" class="level2">
<h2 class="anchored" data-anchor-id="geoms-for-the-people">
Geoms for the people
</h2>
<p>
While ggraph has always included quite a large range of different geoms for showing nodes and edges, this release has managed to add some more. Most importantly, <code>geom_edge_fan()</code> has gained a brother in crime for showing multi-edges. <code>geom_edge_parallel()</code> will draw edges as straight lines but, in the case of multi-edges, will offset them slightly orthogonal to its direction so that there is no overlap. This is a geom best suited for smaller graphs (IMO), but here it can add a very classic look to the plot:
</p>
<pre class="r"><code>small_graph &lt;- create_notable('bull') %&gt;%
  convert(to_directed) %&gt;%
  bind_edges(data.frame(from = c(1, 2, 5, 3), to = c(2, 1, 3, 2)))

ggraph(small_graph, 'stress') + 
  geom_edge_parallel(end_cap = circle(.5), start_cap = circle(.5),
                     arrow = arrow(length = unit(1, 'mm'), type = 'closed')) + 
  geom_node_point(size = 4)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-17-1.png" width="672">
</p>
<p>
For this edge geom in particular it is often a good idea to use capping to let the edges end before they reach the terminal nodes.
</p>
<p>
Another edge geom that has become available is <code>geom_edge_bend()</code> which is sort of an organic elbow geom:
</p>
<pre class="r"><code>ggraph(iris_clust, 'dendrogram', height = height) + 
  geom_edge_bend()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-18-1.png" width="672">
</p>
<p>
Lastly, in addition to the node and edge geoms shown in the Layout section, <code>geom_node_voronoi()</code> has been added. It is a ggraph specific version of <code>ggforce::geom_voronoi_tile()</code> that allows you to create a Voronoi tessellation of the nodes and use the resulting tiles to show the nodes. As with the ggforce version it is possible to constrain the tiles to a specific radius around the node, making it a great way of showing which nodes dominate certain areas without any problems with over-plotting.
</p>
<pre class="r"><code>ggraph(graph, 'stress') + 
  geom_node_voronoi(aes(fill = group), max.radius = 0.5, colour = 'white') + 
  geom_edge_link() + 
  geom_node_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-19-1.png" width="672">
</p>
<p>
A last little thing pertaining to edge geoms is that many have gained a <code>strength</code> argument, which controls their level of non-linearity (this is obviously only available for non-linear edges). Setting <code>strength = 0</code> will result in a linear edge, while setting <code>strength = 1</code> will give the standard look. Everything in between is fair game, while everything outside that range will look exceptionally weird, probably.
</p>
<pre class="r"><code>ggraph(iris_clust, 'dendrogram', height = height) + 
  geom_edge_bend(alpha = 0.3) + 
  geom_edge_bend(strength = 0.5, alpha = 0.3) + 
  geom_edge_bend(strength = 0.2, alpha = 0.3)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-20-1.png" width="672">
</p>
<pre class="r"><code>ggraph(iris_clust, 'dendrogram', height = height) + 
  geom_edge_elbow(alpha = 0.3) + 
  geom_edge_elbow(strength = 0.5, alpha = 0.3) + 
  geom_edge_elbow(strength = 0.2, alpha = 0.3)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-21-1.png" width="672">
</p>
<p>
A few geoms have had arguments such as <code>curvature</code> or <code>spread</code> that served a similar purpose, but those arguments have been deprecated in favor of the same argument across all (applicable) geoms.
</p>
<p>
And then one more last thing, but it is really not something new in ggraph. As you can use standard geoms for drawing nodes, some of the new features in ggforce are of particular interest to ggraph users. The <code>geom_mark_*()</code> family in particular is great for annotating single nodes or groups of nodes, and going forward it will be the advised approach:
</p>
<pre class="r"><code>library(ggforce)
ggraph(graph, 'stress') + 
  geom_edge_link() + 
  geom_node_point() + 
  geom_mark_ellipse(aes(x, y, label = 'Group 3', 
                        description = 'A very special collection of nodes',
                        filter = group == 3))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-22-1-giraffe-2-giraffe-go_files/figure-html/unnamed-chunk-22-1.png" width="672">
</p>
</section>
<section id="all-the-rest" class="level2">
<h2 class="anchored" data-anchor-id="all-the-rest">
All the rest
</h2>
<p>
This is the exciting new stuff, but the release also includes numerous bug fixes and small tweaks… Far too many to be interesting to list, so you must take my word for it 😄.
</p>
<p>
As with ggforce I hope that ggraph never goes this long without a release again. Feel free to flood me with feature requests after you have played with the new version and I’ll do my best to take them on.
</p>
<p>
I’ll spend some time on ggplot2 and grid for now, but still plan on taking a development sprint with patchwork with the intent of getting it on CRAN before the end of this year.
</p>
</section>



 ]]></description>
  <category>ggraph</category>
  <category>package</category>
  <category>announcement</category>
  <category>visualization</category>
  <category>network</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2019-08-22-1-giraffe-2-giraffe-go/</guid>
  <pubDate>Mon, 02 Sep 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/ggraph_announce_2.jpg" medium="image" type="image/jpeg"/>
</item>
<item>
  <title>A Flurry of Facets</title>
  <link>https://tutor-church-15580.netlify.app/posts/2019-08-08-a-flurry-of-facets/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/ggforce_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
When I <a href="../2019-03-04-the-ggforce-awakens-again">announced the last release of ggforce</a> I hinted that I would like to transition to a more piecemeal release habit and avoid those monster releases that the last one was. True to my word, I am now thrilled to announce that a new version of ggforce is available on CRAN for your general consumption. It goes without saying that this release contains fewer features and fixes than the last one, but those it packs are considerable so let’s get to it.
</p>
<section id="build-for-gganimate" class="level2">
<h2 class="anchored" data-anchor-id="build-for-gganimate">
Built for gganimate
</h2>
<p>
The <a href="https://gganimate.com">gganimate</a> package facilitates the creation of animations from ggplot2 plots. It is built to be as general purpose as possible, but it still makes a few assumptions about how the layers in the plot behave. Some of these assumptions were not met in a few of the ggforce geoms (the technical explanation is that some stats and geoms stripped group information from the data, which trips up gganimate). This has been rectified in the new version of ggforce and all geoms should now be ready for use with gganimate (please report back if you run into any problems).
</p>
</section>
<section id="facets-for-the-people" class="level2">
<h2 class="anchored" data-anchor-id="facets-for-the-people">
Facets for the people
</h2>
<p>
The remainder of the release centers around facets and a few geoms that have been made specifically for them.
</p>
<section id="enter-the-matrix" class="level3">
<h3 class="anchored" data-anchor-id="enter-the-matrix">
Enter the matrix
</h3>
<p>
The biggest news is undoubtedly the introduction of <code>facet_matrix()</code>, a facet that allows you to create a grid of panels with different data columns in the different rows and columns of the grid. Examples of such arrangements are known as scatterplot matrices and pairs plots, but these are just a subset of the general approach.
</p>
<p>
Before we go on I will, in the interest of full disclosure, mention that certain types of scatterplot matrices have been possible for a long time. Most powerful has perhaps been the <a href="https://ggobi.github.io/ggally/#ggallyggpairs"><code>ggpairs()</code> function in GGally</a> that provides an API for pairs plots built on top of ggplot2. More low-level and limited has been the possibility of converting the data to a long format by stacking the columns of interest and using <code>facet_grid()</code>. The latter approach requires that all columns of interest are of the same type and further moves a crucial operation of the visualization out of the visualization API. The former approach, while powerful, is a wrapper around ggplot2 rather than an extension of the API. This means that you are limited to what the wrapper function provides, thus losing the flexibility of the ggplot2 API. A plurality of choices is good though, and I’m certain that there is room for all approaches to thrive.
</p>
<p>
To show off <code>facet_matrix()</code> I’ll start with a standard use of scatterplot matrices, namely plotting multiple components from a PCA analysis against each other.
</p>
<pre class="r"><code>library(recipes)
# Data described here: https://bookdown.org/max/FES/chicago-intro.html 
load(url("https://github.com/topepo/FES/blob/master/Data_Sets/Chicago_trains/chicago.RData?raw=true"))

pca_on_stations &lt;- 
  recipe(~ ., data = training %&gt;% select(starts_with("l14_"))) %&gt;% 
  step_center(all_predictors()) %&gt;% 
  step_scale(all_predictors()) %&gt;%
  step_pca(all_predictors(), num_comp = 5) %&gt;% 
  prep() %&gt;% 
  juice()

pca_on_stations</code></pre>
<pre><code>## # A tibble: 5,698 x 5
##       PC1   PC2     PC3     PC4   PC5
##     &lt;dbl&gt; &lt;dbl&gt;   &lt;dbl&gt;   &lt;dbl&gt; &lt;dbl&gt;
##  1   1.37 4.41   0.347   0.150  0.631
##  2   1.86 4.50   0.618   0.161  0.523
##  3   2.03 4.50   0.569   0.0468 0.543
##  4   2.37 4.43   0.498  -0.209  0.559
##  5   2.37 4.13   0.422  -0.745  0.482
##  6 -15.7  1.23   0.0164 -0.180  1.04 
##  7 -21.2  0.771 -0.653   1.35   1.23 
##  8  -8.45 2.36   1.07   -0.143  0.404
##  9   3.04 4.30   0.555  -0.0476 0.548
## 10   2.98 4.45   0.409  -0.125  0.677
## # … with 5,688 more rows</code></pre>
<pre class="r"><code>library(ggforce)

ggplot(pca_on_stations, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(alpha = 0.2, shape = 16, size = 0.5) + 
  facet_matrix(vars(everything()))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-3-1.png" width="672">
</p>
<p>
Let’s walk through that last piece of code. We construct a standard ggplot using <code>geom_point()</code> but we map x and y to <code>.panel_x</code> and <code>.panel_y</code>. These are placeholders created by <code>facet_matrix()</code>. Lastly we add the <code>facet_matrix()</code> specification. At a minimum we’ll need to specify which columns to use. For that we can use standard tidyselect syntax as known from e.g.&nbsp;<code>dplyr::select()</code> (here we use <code>everything()</code> to select all columns).
</p>
<p>
Now, the above plot has some obvious shortcomings. The diagonal is pretty useless for starters, and these panels are often used to plot the distributions of the individual variables. Using e.g.&nbsp;<code>geom_density()</code> won’t work as it always starts at 0, thus messing with the y-scale of each row. ggforce provides two new geoms tailored for the diagonal: <code>geom_autodensity()</code> and <code>geom_autohistogram()</code>, which automatically position themselves inside the panel without affecting the y-scale. We’d still need to have this geom only in the diagonal, but <code>facet_matrix()</code> provides exactly this sort of control:
</p>
<pre class="r"><code>ggplot(pca_on_stations, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(alpha = 0.2, shape = 16, size = 0.5) + 
  geom_autodensity() +
  facet_matrix(vars(everything()), layer.diag = 2)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-4-1.png" width="672">
</p>
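<p>
The histogram variant works the same way. Here is a minimal sketch using the built-in <code>iris</code> data (my choice for the example, so it runs without downloading anything), which also shows a tidyselect helper mixed with a bare column name in the column selection:
</p>
<pre class="r"><code>library(ggforce)

# Pick both Sepal columns with a helper and Petal.Length by name
ggplot(iris, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(alpha = 0.5, shape = 16, size = 0.5) + 
  geom_autohistogram() + 
  facet_matrix(vars(tidyselect::starts_with('Sepal'), Petal.Length), 
               layer.diag = 2)</code></pre>
<p>
As before, <code>layer.diag = 2</code> refers to the second layer, so the histograms end up in the diagonal panels only.
</p>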
<p>
As the y-scale no longer affects the diagonal we’ll emphasize this by removing the horizontal grid lines there:
</p>
<pre class="r"><code>ggplot(pca_on_stations, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(alpha = 0.2, shape = 16, size = 0.5) + 
  geom_autodensity() +
  facet_matrix(vars(everything()), layer.diag = 2, grid.y.diag = FALSE)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
There is still some redundancy left. As the grid is symmetrical the upper and lower triangles show basically the same thing (with flipped axes). We could add some insight by using another geom in one of the areas that shows some summary statistic instead:
</p>
<pre class="r"><code>ggplot(pca_on_stations, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(alpha = 0.2, shape = 16, size = 0.5) + 
  geom_autodensity() +
  geom_density2d() +
  facet_matrix(vars(everything()), layer.diag = 2, layer.upper = 3, 
               grid.y.diag = FALSE)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<p>
While we could call this a day and be pretty pleased with ourselves, I’ll need to show the final party trick of <code>facet_matrix()</code>. The above example was kind of easy because all the variables were continuous. What if we had a mix?
</p>
<pre class="r"><code>ggplot(mpg, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(shape = 16, size = 0.5) + 
  facet_matrix(vars(fl, displ, hwy))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
As we can see <code>facet_matrix()</code> itself handles the mix of scale types quite well, but <code>geom_point()</code> is not that telling when used on a mix of continuous and discrete position scales. ggforce handles this by providing a new position adjustment (<code>position_auto()</code>) that jitters the data based on the scale types. For continuous vs discrete it does a sina-like jitter, whereas for discrete vs discrete it jitters inside a disc (continuous vs continuous makes no jitter):
</p>
<pre class="r"><code>ggplot(mpg, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(shape = 16, size = 0.5, position = 'auto') + 
  facet_matrix(vars(fl, displ, hwy))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
<p>
<code>geom_autodensity()</code> and <code>geom_autohistogram()</code> also know how to handle both discrete and continuous data, so these can be used safely in all circumstances (here also showing that you can of course map other aesthetics as well):
</p>
<pre class="r"><code>ggplot(mpg, aes(x = .panel_x, y = .panel_y, fill = drv, colour = drv)) + 
  geom_point(shape = 16, size = 0.5, position = 'auto') + 
  geom_autodensity(alpha = 0.3, colour = NA, position = 'identity') + 
  facet_matrix(vars(fl, displ, hwy), layer.diag = 2)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
<p>
Lastly, if you need to use a geom that only makes sense with a specific combination of scales, you can pick these layers directly, though you may end up fiddling a bit to get all the right layers where you want them:
</p>
<pre class="r"><code>ggplot(mpg, aes(x = .panel_x, y = .panel_y, fill = drv, colour = drv)) + 
  geom_point(shape = 16, size = 0.5, position = 'auto') + 
  geom_autodensity(alpha = 0.3, colour = NA, position = 'identity') + 
  geom_smooth(aes(colour = NULL, fill = NULL)) + 
  facet_matrix(vars(fl, displ, hwy), layer.diag = 2, layer.continuous = TRUE,
               layer.mixed = -3, layer.discrete = -3)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-10-1.png" width="672">
</p>
<p>
The last example I’m going to show is simply that you don’t have to create symmetric grids. By default <code>facet_matrix()</code> sets the column selection to be the same as the row selection, but you can override that:
</p>
<pre class="r"><code>ggplot(mpg, aes(x = .panel_x, y = .panel_y)) + 
  geom_point(shape = 16, size = 0.5, position = 'auto') + 
  facet_matrix(vars(manufacturer, hwy), vars(drv, cty))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-11-1.png" width="672">
</p>
<p>
As you can hopefully appreciate, <code>facet_matrix()</code> is maximally flexible, while keeping the API of the standard use cases relatively clean. The lack of a ggplot2-like API for plotting different variables against each other in a grid has been a major annoyance for me, and I’m very pleased with how I finally solved it. I hope you’ll put it to good use as well.
</p>
</section>
<section id="who-needs-two-dimensions-anyway" class="level3">
<h3 class="anchored" data-anchor-id="who-needs-two-dimensions-anyway">
Who needs two dimensions anyway?
</h3>
<p>
The last new pack of facets is more benign, but something repeatedly requested. <code>facet_row()</code> and its cousin <code>facet_col()</code> are one-dimensional mixes of <code>facet_grid()</code> and <code>facet_wrap()</code>. They arrange the panels in a single row or single column respectively (like setting <code>nrow</code> or <code>ncol</code> to <code>1</code> in <code>facet_wrap()</code>), but by doing so allow the addition of a <code>space</code> argument as known from <code>facet_grid()</code>. In contrast to using <code>facet_grid()</code> with a single column or row, these new facets retain the <code>facet_wrap()</code> ability of having completely separate scale ranges as well as positioning the facet strip wherever you please:
</p>
<pre class="r"><code>ggplot(mpg) + 
  geom_bar(aes(x = manufacturer)) + 
  facet_col(~drv, scales = 'free_y', space = 'free', labeller = label_both) + 
  coord_flip()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-08-08-a-flurry-of-facets_files/figure-html/unnamed-chunk-12-1.png" width="672">
</p>
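<p>
For completeness, here is a minimal sketch of the row-wise counterpart, assuming that <code>facet_row()</code> accepts the same <code>scales</code>, <code>space</code>, and <code>labeller</code> arguments as <code>facet_col()</code> above. The bars are kept vertical here, so it is the free x scale that lets each panel get a width matching the number of manufacturers it holds:
</p>
<pre class="r"><code>library(ggplot2)
library(ggforce)

# One panel per drive train, laid out in a single row; with a free x
# scale and free space, panel widths follow the number of bars shown
ggplot(mpg) + 
  geom_bar(aes(x = manufacturer)) + 
  facet_row(~drv, scales = 'free_x', space = 'free', labeller = label_both)</code></pre>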
<p>
So, this was the flurry of facets I was going to bring you today. I hope you’ll put them to good use and create some awesome visualizations with them.
</p>
<p>
Next up: the next ggraph release!
</p>
</section>
</section>



 ]]></description>
  <category>ggforce</category>
  <category>package</category>
  <category>announcement</category>
  <category>visualization</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2019-08-08-a-flurry-of-facets/</guid>
  <pubDate>Thu, 08 Aug 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/ggforce_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>The ggforce Awakens (again)</title>
  <link>https://tutor-church-15580.netlify.app/posts/2019-03-04-the-ggforce-awakens-again/</link>
  <description><![CDATA[ 




<p>
After what seems like a lifetime (at least to me), a new feature release of ggforce is available on CRAN. ggforce is my general purpose extension package for ggplot2, my first early success, what got me on twitter in the first place, and ultimately instrumental in my career move towards full-time software/R development. Despite this pedigree ggforce hasn’t really received much love in the form of a feature release since, well, since it was released. One of the reasons for this is that after the first release I began pushing changes to ggplot2 that allowed for different stuff I wanted to do in ggforce, so the release of the next ggforce version became tied to the release of ggplot2. This doesn’t happen every day, and when it eventually transpired, I was deep in patchwork and gganimate development, and couldn’t take time off to run the last mile with ggforce. In the future I’ll probably be more conservative with my ggplot2 version dependency, or at least keep it out of the main branch until a ggplot2 release is in sight.
</p>
<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/ggforce_release020.png" class="img-fluid" style="width: 100%;">
</p>
<p>
Enough excuses though, a new version is finally here and it’s a glorious one. Let’s celebrate! This version brings both a slew of refinements to existing functionality and a vast expanse of new features, so there’s enough to dig into.
</p>
<section id="new-features" class="level2">
<h2 class="anchored" data-anchor-id="new-features">
New features
</h2>
<p>
This is why we’re all here, right? The new and shiny! Let’s get going; the list is pretty long.
</p>
<section id="the-shape-of-geoms" class="level3">
<h3 class="anchored" data-anchor-id="the-shape-of-geoms">
The Shape of Geoms
</h3>
<p>
Many of the new and current geoms and stats in ggforce are really there to allow you to draw different types of shapes easily. This means that the workhorse of these has been <code>geom_polygon()</code>, while ggforce provided the means to describe the shapes in meaningful ways (e.g.&nbsp;wedges, circles, thick arcs). With the new release all of these geoms (as well as the new ones) will use the new <code>geom_shape()</code> under the hood. The shape geom is an extension of the polygon one that allows a bit more flourish in how the final shape is presented. It does this by providing two additional parameters: <code>expand</code> and <code>radius</code>, which will allow fixed unit expansion (and contraction) of the polygons as well as rounding of the corners based on a fixed unit radius. What do I mean by <em>fixed unit</em>? In the same way as the points in <code>geom_point</code> stay the same size during resizing of the plot, so do the corner radius and expansion of the polygon.
</p>
<p>
Let us modify the <code>geom_polygon()</code> example to use <code>geom_shape()</code> to see what it is all about:
</p>
<pre class="r"><code>library(ggforce)

ids &lt;- factor(c("1.1", "2.1", "1.2", "2.2", "1.3", "2.3"))
values &lt;- data.frame(
  id = ids,
  value = c(3, 3.1, 3.1, 3.2, 3.15, 3.5)
)
positions &lt;- data.frame(
  id = rep(ids, each = 4),
  x = c(2, 1, 1.1, 2.2, 1, 0, 0.3, 1.1, 2.2, 1.1, 1.2, 2.5, 1.1, 0.3,
  0.5, 1.2, 2.5, 1.2, 1.3, 2.7, 1.2, 0.5, 0.6, 1.3),
  y = c(-0.5, 0, 1, 0.5, 0, 0.5, 1.5, 1, 0.5, 1, 2.1, 1.7, 1, 1.5,
  2.2, 2.1, 1.7, 2.1, 3.2, 2.8, 2.1, 2.2, 3.3, 3.2)
)
datapoly &lt;- merge(values, positions, by = c("id"))

# Standard look
ggplot(datapoly, aes(x = x, y = y)) +
  geom_polygon(aes(fill = value, group = id))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<pre class="r"><code># Contracted and rounded
ggplot(datapoly, aes(x = x, y = y)) +
  geom_shape(aes(fill = value, group = id), 
             expand = unit(-2, 'mm'), radius = unit(5, 'mm'))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-2-2.png" width="672">
</p>
<p>
If you’ve never needed this, it may be the kind of thing that makes you go <em>why even bother</em>, but if you’ve needed to venture into Adobe Illustrator to add this kind of flourish you will definitely appreciate skipping that round-trip. And remember: you can use this in place of anything that expects a <code>geom_polygon</code>, not just the ones from ggforce.
</p>
</section>
<section id="more-shape-primitives" class="level3">
<h3 class="anchored" data-anchor-id="more-shape-primitives">
More shape primitives
</h3>
<p>
While <code>geom_shape()</code> is the underlying engine for drawing, ggforce adds a bunch of new shape parameterisations, which we will quickly introduce:
</p>
<p>
<code>geom_ellipse</code> makes, you guessed it, ellipses. Apart from standard ellipses it also offers the possibility of making super-ellipses so if you’ve been dying to draw those with ggplot2, now is your time to shine.
</p>
<pre class="r"><code># Not an ordinary ellipse — a super-ellipse
ggplot() +
  geom_ellipse(aes(x0 = 0, y0 = 0, a = 6, b = 3, angle = -pi / 3, m1 = 3)) +
  coord_fixed()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-3-1.png" width="672">
</p>
<p>
<code>geom_bspline_closed</code> allows you to draw closed b-splines. It takes the same type of input as <code>geom_polygon</code> but calculates a closed b-spline from the corner points instead of just connecting them.
</p>
<pre class="r"><code># Create 6 random control points
controls &lt;- data.frame(
  x = runif(6),
  y = runif(6)
)

ggplot(controls, aes(x, y)) +
  geom_polygon(fill = NA, colour = 'grey') +
  geom_point(colour = 'red') +
  geom_bspline_closed(alpha = 0.5)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-4-1.png" width="672">
</p>
<p>
<code>geom_regon</code> draws regular polygons of a set radius and number of sides.
</p>
<pre class="r"><code>ggplot() +
  geom_regon(aes(x0 = runif(8), y0 = runif(8), sides = sample(3:10, 8),
                 angle = 0, r = runif(8) / 10)) +
  coord_fixed()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
<code>geom_diagonal_wide</code> draws thick diagonals (cubic bezier paths with the two control points pointing towards each other but perpendicular to the same axis).
</p>
<pre class="r"><code>data &lt;- data.frame(
  x = c(1, 2, 2, 1, 2, 3, 3, 2),
  y = c(1, 2, 3, 2, 3, 1, 2, 5),
  group = c(1, 1, 1, 1, 2, 2, 2, 2)
)

ggplot(data) +
  geom_diagonal_wide(aes(x, y, group = group))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
</section>
<section id="is-it-a-sankey-is-it-an-alluvial-no-its-a-parallel-set" class="level3">
<h3 class="anchored" data-anchor-id="is-it-a-sankey-is-it-an-alluvial-no-its-a-parallel-set">
Is it a Sankey? Is it an Alluvial? No, It’s a Parallel Set
</h3>
<p>
Speaking of diagonals, one of the prime uses of these is for creating parallel sets visualizations. There’s a fair bit of nomenclature confusion with this, so you may know these as Sankey diagrams, or perhaps alluvial plots. I’ll insist that Sankey diagrams are specifically for following flows (and often employ a looser positioning of the axes) and alluvial plots are for following temporal changes, but we can all be friends no matter what you call it. ggforce allows you to create parallel sets plots with a standard layered geom approach (for another approach to this problem, see <a href="https://github.com/corybrunson/ggalluvial">the ggalluvial package</a>). The main problem is that data for parallel sets plots is usually not represented very well in the tidy format expected by ggplot2, so ggforce further provides a reshaping function to get the data in line for plotting:
</p>
<pre class="r"><code>titanic &lt;- reshape2::melt(Titanic)
# This is how we usually envision data for parallel sets
head(titanic)</code></pre>
<pre><code>##   Class    Sex   Age Survived value
## 1   1st   Male Child       No     0
## 2   2nd   Male Child       No     0
## 3   3rd   Male Child       No    35
## 4  Crew   Male Child       No     0
## 5   1st Female Child       No     0
## 6   2nd Female Child       No     0</code></pre>
<pre class="r"><code># Reshape for putting the first 4 columns as axes in the plot
titanic &lt;- gather_set_data(titanic, 1:4)
head(titanic)</code></pre>
<pre><code>##   Class    Sex   Age Survived value id     x    y
## 1   1st   Male Child       No     0  1 Class  1st
## 2   2nd   Male Child       No     0  2 Class  2nd
## 3   3rd   Male Child       No    35  3 Class  3rd
## 4  Crew   Male Child       No     0  4 Class Crew
## 5   1st Female Child       No     0  5 Class  1st
## 6   2nd Female Child       No     0  6 Class  2nd</code></pre>
<pre class="r"><code># Do the plotting
ggplot(titanic, aes(x, id = id, split = y, value = value)) +
  geom_parallel_sets(aes(fill = Sex), alpha = 0.3, axis.width = 0.1) +
  geom_parallel_sets_axes(axis.width = 0.1) +
  geom_parallel_sets_labels(colour = 'white')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<p>
As can be seen, the parallel sets plot consists of several layers, which is something required for many of the more involved, composite plot types. Separating them into multiple layers gives you more freedom without over-polluting the argument and aesthetic list.
</p>
</section>
<section id="the-markings-of-a-great-geom" class="level3">
<h3 class="anchored" data-anchor-id="the-markings-of-a-great-geom">
The markings of a great geom
</h3>
<p>
If there is one thing of general utility lacking in ggplot2 it is probably the ability to annotate data cleanly. Sure, there’s <code>geom_text()</code>/<code>geom_label()</code> but using them requires a fair bit of fiddling to get the best placement and further, they are mainly relevant for labeling and not longer text. <code>ggrepel</code> has improved immensely on the fiddling part, but the lack of support for longer text annotation as well as annotating whole areas is still an issue.
</p>
<p>
In order to at least partly address this, ggforce includes a family of geoms under the <code>geom_mark_*()</code> moniker. They all behave equivalently except for how they encircle the given area(s). The four different geoms are:
</p>
<ul>
<li>
<code>geom_mark_rect()</code> encloses the data in the smallest enclosing rectangle
</li>
<li>
<code>geom_mark_circle()</code> encloses the data in the smallest enclosing circle
</li>
<li>
<code>geom_mark_ellipse()</code> encloses the data in the smallest enclosing ellipse
</li>
<li>
<code>geom_mark_hull()</code> encloses the data with a concave or convex hull
</li>
</ul>
<p>
All the enclosures are calculated at draw time and so respond to resizing (most are susceptible to changing aspect ratios). Further, they use <code>geom_shape()</code> with a default expansion and radius set, so that the enclosure is always slightly larger than the data it needs to enclose.
</p>
<p>
Just to give a quick sense of it, here’s an example of <code>geom_mark_ellipse()</code>:
</p>
<pre class="r"><code>ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_mark_ellipse(aes(fill = Species)) +
  geom_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
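<p>
The other members of the family are meant to be drop-in replacements. As a rough sketch on the same data, swapping in <code>geom_mark_rect()</code> should give rectangular enclosures instead, with everything else left unchanged:
</p>
<pre class="r"><code>library(ggplot2)
library(ggforce)

# Same data and mapping as the ellipse example; only the enclosing
# shape changes
ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_mark_rect(aes(fill = Species)) +
  geom_point()</code></pre>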
<p>
If you simply want to show the area where different classes appear, we’re pretty much done now, as the shapes along with the legend tell the story. But I promised you some more: textual annotation. So how does this fit into it all?
</p>
<p>
In addition to the standard aesthetics for shapes, the mark geoms also take a <code>label</code> and <code>description</code> aesthetic. When used, things get interesting:
</p>
<pre class="r"><code>ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_mark_ellipse(aes(fill = Species, label = Species)) +
  geom_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
<p>
The text is placed automatically so that it does not overlap with any data used in the layer, and it responds once again to resizing, always trying to find the best placement of the text. If it is not possible to place the desired text it elects to not show it at all.
</p>
<p>
Anyway, in the plot above we have an overabundance of annotation. Both the legend and the labels. Further, we often want to add annotations to specific data in the plot, not all of it. We can put focus on setosa by ignoring the other groups:
</p>
<pre class="r"><code>desc &lt;- 'This iris species has a markedly smaller petal than the others.'
ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_mark_ellipse(aes(filter = Species == 'setosa', label = 'Setosa', 
                        description = desc)) +
  geom_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-10-1.png" width="672">
</p>
<p>
We are using another one of the mark geom family’s tricks here, which is the <code>filter</code> aesthetic. It makes it quick to specify the data you want to annotate, but in addition the remaining data is remembered so that any annotation doesn’t overlap with it even if it is not getting annotated (you wouldn’t get this if you pre-filtered the data for the layer). Another thing that happens behind the scenes is that the <code>description</code> text automatically gets word wrapping, based on a desired width of the text-box (defaults to 5 cm).
</p>
<p>
The mark geoms offer a wide range of possibilities for styling the annotation, too many to go into detail with here, but rest assured that you have full control over text appearance, background, line, distance between data and text-box etc.
</p>
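<p>
To give a flavour of that styling, here is a small sketch building on the setosa example. The argument names (<code>label.fontsize</code>, <code>label.fill</code>, <code>label.buffer</code>, <code>con.colour</code>, <code>con.type</code>) are the ones I’d expect to find in the <code>geom_mark_*()</code> documentation, and the values are arbitrary picks for illustration, so check the help page of your installed version before relying on them:
</p>
<pre class="r"><code>library(ggplot2)
library(ggforce)

ggplot(iris, aes(Petal.Length, Petal.Width)) +
  geom_mark_ellipse(
    aes(filter = Species == 'setosa', label = 'Setosa',
        description = 'This iris species has a markedly smaller petal than the others.'),
    label.fontsize = c(12, 9),      # first value for the label, second for the description
    label.fill = 'grey95',          # background fill of the text-box
    label.buffer = unit(10, 'mm'),  # region around the mark where the text-box may not be placed
    con.colour = 'grey40',          # colour of the connector line
    con.type = 'straight'           # straight connector instead of an elbow
  ) +
  geom_point()</code></pre>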
</section>
<section id="lost-in-tessellation" class="level3">
<h3 class="anchored" data-anchor-id="lost-in-tessellation">
Lost in Tessellation
</h3>
<p>
The last of the big additions in this release is a range of geoms for creating and plotting Delaunay triangulation and Voronoi tessellation. How often do you need that, you ask? Maybe never… Does it look wicked cool? Why, yes!
</p>
<p>
Delaunay triangulation is a way to connect points to their nearest neighbors without any connections overlapping. By nature, this results in triangles being created. This data can either be thought of as a set of triangles, or a set of line segments, and ggforce provides both through the <code>geom_delaunay_tile()</code> and <code>geom_delaunay_segment()</code> geoms. Further, a <code>geom_delaunay_segment2()</code> version exists that mimics <code>geom_link2</code> in allowing aesthetic interpolation between endpoints.
</p>
<p>
As we are already quite acquainted with the Iris dataset, let’s take it for a whirl again:
</p>
<pre class="r"><code>ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_delaunay_tile(alpha = 0.3) + 
  geom_delaunay_segment2(aes(colour = Species, group = -1), size = 2,
                         lineend = 'round')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-11-1.png" width="672">
</p>
<p>
The triangulation is not calculated at draw time and is thus susceptible to range differences on the x and y axes. To combat this it is possible to normalize the position data before calculating the triangulation.
</p>
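<p>
A minimal sketch of that, assuming the <code>normalize</code> argument of the Delaunay geoms is the switch in question (it should rescale x and y to a common range before the triangulation is computed and transform the result back afterwards):
</p>
<pre class="r"><code>library(ggplot2)
library(ggforce)

# Triangulate on normalized coordinates so the differing ranges of
# the two axes do not skew which points count as neighbors
ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_delaunay_tile(alpha = 0.3, colour = 'black', normalize = TRUE)</code></pre>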
<p>
Voronoi tessellation is sort of an inverse of Delaunay triangulation. It draws perpendicular segments in the middle of all the triangulation segments and connects the neighboring ones. The end result is a tile around each point marking the area where the point is the closest one. In parallel to the triangulation, Voronoi also comes with both a tile and a segment version.
</p>
<pre class="r"><code>ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_voronoi_tile(aes(fill = Species, group = -1L)) + 
  geom_voronoi_segment() +
  geom_point()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-12-1.png" width="672">
</p>
<p>
We need to set the <code>group</code> aesthetic to a scalar in order to force all points to be part of the same tessellation. Otherwise each group would get its own:
</p>
<pre class="r"><code>ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_voronoi_tile(aes(fill = Species), colour = 'black')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-13-1.png" width="672">
</p>
<p>
Let’s quickly move on from that…
</p>
<p>
As a Voronoi tessellation can in theory expand forever, we need to define a bounding box. The default is to expand an enclosing rectangle 10% to each side, but you can supply your own rectangle, or even an arbitrary polygon. Further, it is possible to set a radius bound for each point instead:
</p>
<pre class="r"><code>ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_voronoi_tile(aes(fill = Species, group = -1L), max.radius = 0.2,
                    colour = 'black')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-14-1.png" width="672">
</p>
<p>
This functionality is only available for the tile geom, not the segment, but this will hopefully change with a later release.
</p>
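<p>
Going back to the bounding box mentioned above, here is a sketch of supplying your own rectangle through the <code>bound</code> argument. It assumes the vector is read as xmin, xmax, ymin, ymax, and the limits are simply picked by hand to enclose the iris sepal measurements:
</p>
<pre class="r"><code>library(ggplot2)
library(ggforce)

# Clip the tessellation to a hand-picked rectangle instead of the
# default 10% expansion around the data
ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_voronoi_tile(aes(fill = Species, group = -1L), colour = 'black',
                    bound = c(4, 8, 1.5, 5)) + 
  geom_point()</code></pre>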
<p>
A last point, just to beat a dead horse, is that the tile geoms of course inherit from <code>geom_shape()</code> so if you like them rounded corners you can have it your way:
</p>
<pre class="r"><code>ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_voronoi_tile(aes(fill = Species, group = -1L), max.radius = 1,
                    colour = 'black', expand = unit(-0.5, 'mm'), 
                    radius = unit(0.5, 'mm'), show.legend = FALSE)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-15-1.png" width="672">
</p>
</section>
<section id="zoom" class="level3">
<h3 class="anchored" data-anchor-id="zoom">
Zoom
</h3>
<p>
Not a completely new feature like the ones above, but <code>facet_zoom()</code> has gained enough new power to warrant a mention. The gist of the facet is that it allows you to zoom in on an area of the plot while keeping the original view as a separate panel. The old version only allowed specifying the zoom region by providing a logical expression that indicated what data should be part of the zoom, but it now has dedicated <code>xlim</code> and <code>ylim</code> arguments to set the limits directly.
</p>
<pre class="r"><code>ggplot(diamonds) + 
  geom_histogram(aes(x = price), bins = 50) + 
  facet_zoom(xlim = c(3000, 5000), ylim = c(0, 2500), horizontal = FALSE)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-16-1.png" width="672">
</p>
<p>
The example above shows a shortcoming in simply zooming in on a plot. Sometimes the resolution (here, the bins) isn’t really meaningful for zooming. Because of this, <code>facet_zoom()</code> has gotten a <code>zoom.data</code> argument to indicate what data to put on the zoom panel and what to put on the overview panel (and what to put in both places). It takes a logical expression to evaluate on the data and if it returns <code>TRUE</code> the data is put in the zoom panel, if it returns <code>FALSE</code> it is put on the overview panel, and if it returns <code>NA</code> it is put in both. To improve the visualization above, we’ll add two layers with different numbers of bins and use <code>zoom.data</code> to put them in the right place:
</p>
<pre class="r"><code>ggplot() + 
  geom_histogram(aes(x = price), dplyr::mutate(diamonds, z = FALSE), bins = 50) + 
  geom_histogram(aes(x = price), dplyr::mutate(diamonds, z = TRUE), bins = 500) + 
  facet_zoom(xlim = c(3000, 5000), ylim = c(0, 300), zoom.data = z,
             horizontal = FALSE) + 
  theme(zoom.y = element_blank(), validate = FALSE)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-03-04-the-ggforce-awakens-again_files/figure-html/unnamed-chunk-17-1.png" width="672">
</p>
<p>
The last flourish we did above was to remove the zoom indicator for the y axis zoom by using the <code>zoom.y</code> theme element. We currently need to turn off validation for this to work as ggplot2 by default doesn’t allow unknown theme elements.
</p>
</section>
</section>
<section id="all-the-rest" class="level2">
<h2 class="anchored" data-anchor-id="all-the-rest">
All the rest
</h2>
<p>
The above is just the most worthwhile, but the release also includes a slew of other features and improvements. Notable mentions are
</p>
<ul>
<li>
<code>geom_sina()</code> rewrite to allow dodging and follow the shape of <code>geom_violin()</code>
</li>
<li>
<code>position_jitternormal()</code> that jitters points based on a normal distribution instead of a uniform one
</li>
<li>
<code>facet_stereo()</code> to allow for faux 3D plots
</li>
</ul>
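<p>
As a quick, hedged illustration of the first two items (the data and the <code>sd_x</code>/<code>sd_y</code> values are arbitrary choices for the sketch):
</p>
<pre class="r"><code>library(ggplot2)
library(ggforce)

# Sina points spread out according to the density that the violin
# behind them outlines
ggplot(mpg, aes(drv, hwy)) +
  geom_violin() +
  geom_sina(size = 0.5)

# Normally distributed jitter along x only; sd_y = 0 leaves hwy as is
ggplot(mpg, aes(drv, hwy)) +
  geom_point(position = position_jitternormal(sd_x = 0.05, sd_y = 0))</code></pre>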
<p>
See the <a href="https://ggforce.data-imaginist.com/news/index.html"><code>NEWS.md</code></a> file for the full list.
</p>
<p>
Further, ggforce now has a website at <a href="https://ggforce.data-imaginist.com" class="uri">https://ggforce.data-imaginist.com</a>, with full documentation overview etc. This is something I plan to roll out to all my major packages during the next release cycle. I’ve found that it is a great incentive to improve the examples in the documentation!
</p>
<p>
I do hope that it won’t take another two years before ggforce sees the next big update. It is certainly a burden off my shoulders to get this out of the door and I hope I can adhere to smaller, more frequent, releases in the future.
</p>
<p>
Now go get plotting!
</p>
</section>



 ]]></description>
  <category>announcement</category>
  <category>ggforce</category>
  <category>visualization</category>
  <category>package</category>
  <category>R</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2019-03-04-the-ggforce-awakens-again/</guid>
  <pubDate>Thu, 07 Mar 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/ggforce_release020.png" medium="image" type="image/png" height="82" width="144"/>
</item>
<item>
  <title>gganimate has transitioned to a state of release</title>
  <link>https://tutor-church-15580.netlify.app/posts/2019-01-02-gganimate-has-transitioned-to-a-state-of-release/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/gganimate_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
Just to start off the year in a positive way, I’m happy to announce that <em>gganimate</em> is now available on CRAN. This release is the result of a pretty focused development effort starting in the spring of 2018 prior to my useR keynote about it.
</p>
<section id="some-history" class="level2">
<h2 class="anchored" data-anchor-id="some-history">
Some History
</h2>
<p>
The <em>gganimate</em> package has been around for quite some time now with David Robinson making <a href="https://github.com/thomasp85/gganimate/commit/81e5e95b33e90b0222314dd2ac187a749596dab0">the first commit</a> in early 2016. David’s vision of gganimate revolved around the idea of frame-as-an-aesthetic and this easy-to-grasp idea gave it an early success. The version developed by David never made it to CRAN, and as part of ramping down his package development he asked me if I was interested in taking over maintenance. I was initially reluctant because I wanted a completely different API, but he insisted that he supported a complete rewrite. The last version of gganimate as maintained by David <a href="https://github.com/thomasp85/gganimate/releases/tag/v0.1.1">is still available</a> but I very quickly made some <a href="https://github.com/thomasp85/gganimate/commit/ec82efbad04b9f19f125d9f3537bffbb202c38ac?diff=unified">drastic changes</a>:
</p>
<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/burn-it-down.png">
</p>
<p>
While this commit was done in the autumn of 2017, nothing further happened until I decided to make gganimate the center of my useR 2018 keynote, at which point I was forced (by myself) to have some sort of package ready by the summer of 2018.
</p>
<p>
A fair number of users have shown displeasure in the breaking changes this history has resulted in. Many blog posts have already been written focusing on the old API, and there is code on numerous computers that will no longer work. I understand this frustration, of course, but both David and I agreed that doing it this way was for the best in the end. I’m positive that the new API has already greatly exceeded the mind-share of the old API and given a year the old API will be all but a distant memory…
</p>
</section>
<section id="the-grammar" class="level2">
<h2 class="anchored" data-anchor-id="the-grammar">
The Grammar
</h2>
<p>
Such drastic breaking changes were required because of a completely different vision for how animation fits into the grammar of graphics. David’s idea was that it was essentially a third dimension in the graphic and the animation was simply flipping through slices along the third dimension in the same way as you would look through the output of a CT scan. I, on the other hand, wanted a grammar that existed in parallel to the grammar of graphics, not as part of it.
</p>
<p>
My useR keynote goes into a lot of detail about my motivation and inspiration for taking on this approach, and I’ll not rehash it in this release post. Feel free to take a 1h break from reading as you watch the talk:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/21ZWDrTukEs" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="">
</iframe>
<p>
The gist of it all is that animations are a multifaceted beast and require much more than an additional aesthetic to be tamed. One of the cornerstones of the talk is the separation of animations into scenes and segues. In short, a segue is an animated change in the underlying laws of the graphic (e.g.&nbsp;changes to coordinate systems, scales, mappings, etc.), whereas a scene is a change in the data on display. Scenes are concerned with <em>what</em> and segues are concerned with <em>how</em>. This separation is important for several reasons: it gives me a natural focus area for the current version of gganimate (scenes), it serves as a theoretical backbone to group animation operations, and it is a central tenet of good animation practice: “You should never change <em>how</em> and <em>what</em> at the same time”.
</p>
<p>
So, the version I’m presenting here is a grammar of animation uniquely focused on <em>scenes</em>. This does not mean that I’ll never look into segues, but they are both much harder, and less important than getting a scene grammar to make sense, so segues have to play second fiddle for now.
</p>
<section id="whats-in-a-scene" class="level3">
<h3 class="anchored" data-anchor-id="whats-in-a-scene">
What’s in a scene
</h3>
<p>
There are two main components to a scene: What we are looking at, and where we are looking from. The former is handled by <em>transitions</em> and <em>shadows</em>, whereas the latter is handled by <em>views</em>. In brief:
</p>
<ul>
<li>
<strong>transitions</strong> populate the frames of the animation with data, based on the data assigned to each layer. Several different transitions exist that interpret the layer data differently.
</li>
<li>
<strong>shadows</strong> give memory to each frame by letting each frame include data from prior or future frames.
</li>
<li>
<strong>views</strong> allow you to modify the range of the positional scales (zoom and pan) either directly or as a function of the data assigned to the frame.
</li>
</ul>
<p>
On top of these three main grammar components there is a range of functions to modify how key parts of animations behave. For a general introduction to the ins and outs of the API, please see the <a href="https://gganimate.com/articles/gganimate.html"><em>Getting Started</em></a> guide.
</p>
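<p>
As a minimal sketch of how the three components combine, a single plot can carry one transition, one shadow, and one view. The choice of the <code>airquality</code> data and all parameter values below are assumptions made purely for illustration:
</p>
<pre class="r"><code>library(ggplot2)
library(gganimate)

anim &lt;- ggplot(airquality, aes(x = Day, y = Temp)) + 
  geom_point() + 
  # transition: one state per month, tweening between them
  transition_states(Month, transition_length = 2, state_length = 1) + 
  # shadow: each frame keeps a fading trail of the preceding frames
  shadow_wake(wake_length = 0.1) + 
  # view: the y range follows the data in the frame, the x range stays fixed
  view_follow(fixed_x = TRUE)</code></pre>
<p>
Printing <code>anim</code> renders the animation. Dropping the shadow or the view line still leaves a valid animation, which is the benefit of keeping the components separate.
</p>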
</section>
</section>
<section id="grammar-vs-api" class="level2">
<h2 class="anchored" data-anchor-id="grammar-vs-api">
Grammar vs API
</h2>
<p>
While it may appear that grammar and API are the same, this is not the case. A grammar is a theoretical construct, a backbone from which an API can be defined. Several APIs could implement the same grammar in multiple, incompatible, ways. For gganimate I have tried to align the API as much as possible with the ggplot2 API, so that the line between the two packages becomes blurred. You change a plot to an animation by adding functions from gganimate to it, and the animation is rendered when printing the animation object in the same way as ggplots are rendered when printing the object. An example of this is adding <code>transition_reveal()</code> to a plot to make it appear gradually along a numeric variable:
</p>
<pre class="r"><code>library(ggplot2)
library(gganimate)

ggplot(airquality) + 
  geom_line(aes(x = Day, y = Temp, group = Month))</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-01-02-gganimate-has-transitioned-to-a-state-of-release_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<pre class="r"><code>last_plot() + 
  transition_reveal(Day)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-01-02-gganimate-has-transitioned-to-a-state-of-release_files/figure-html/unnamed-chunk-3-1.gif"><!-- -->
</p>
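<p>
Printing is not the only way to render. As a hedged sketch, the animation can also be rendered explicitly with <code>animate()</code> and written to disk with <code>anim_save()</code>. The frame count, frame rate, and file name below are arbitrary illustrative choices, and a GIF renderer such as gifski is assumed to be installed:
</p>
<pre class="r"><code>library(ggplot2)
library(gganimate)

p &lt;- ggplot(airquality) + 
  geom_line(aes(x = Day, y = Temp, group = Month)) + 
  transition_reveal(Day)

# Render explicitly instead of relying on print()
anim &lt;- animate(p, nframes = 50, fps = 10)

# Save the rendered animation (the file name is a placeholder)
anim_save('airquality.gif', animation = anim)</code></pre>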
<p>
For the most part, the marriage between the ggplot2 and gganimate APIs is a happy one, though it does show at points that the ggplot2 API was never designed with animation in mind. I am particularly pleased with how powerful the API has turned out, and I have already seen countless uses I had never anticipated.
</p>
</section>
<section id="making-fireworks" class="level2">
<h2 class="anchored" data-anchor-id="making-fireworks">
Making Fireworks
</h2>
<p>
While a proper introduction to its use is better kept for a separate document (such as the <em>Getting Started</em> guide mentioned earlier), I think I would do gganimate a disservice by not showing off at least a single fully fledged example. Below is the code needed to make fireworks with gganimate:
</p>
<pre class="r"><code># Firework colours
colours &lt;- c(
  'lawngreen',
  'gold',
  'white',
  'orchid',
  'royalblue',
  'yellow',
  'orange'
)
# Produce data for a single blast
blast &lt;- function(n, radius, x0, y0, time) {
  u &lt;- runif(n, -1, 1)
  rho &lt;- runif(n, 0, 2*pi)
  x &lt;- radius * sqrt(1 - u^2) * cos(rho) + x0
  y &lt;- radius * sqrt(1 - u^2) * sin(rho) + y0
  id &lt;- sample(.Machine$integer.max, n + 1)
  data.frame(
    x = c(x0, rep(x0, n), x0, x),
    y = c(0, rep(y0, n), y0, y),
    id = rep(id, 2),
    time = c((time - y0) * runif(1), rep(time, n), time, time + radius + rnorm(n)),
    colour = c('white', rep(sample(colours, 1), n), 'white', rep(sample(colours, 1), n)),
    stringsAsFactors = FALSE
  )
}
# Make 20 blasts
n &lt;- round(rnorm(20, 30, 4))
radius &lt;- round(n + sqrt(n))
x0 &lt;- runif(20, -30, 30)
y0 &lt;- runif(20, 40, 80)
time &lt;- runif(20, max = 100)
fireworks &lt;- Map(blast, n = n, radius = radius, x0 = x0, y0 = y0, time = time)
fireworks &lt;- dplyr::bind_rows(fireworks)</code></pre>
<p>
All of the above is just data preparation. <code>blast()</code> simply creates segments from the center of the blast and out to the periphery, sampling colours from the <code>colours</code> vector. The end result, if plotted statically, looks like this:
</p>
<pre class="r"><code>ggplot(fireworks) + 
  geom_path(aes(x = x, y = y, group = id, colour = colour)) + 
  scale_colour_identity()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-01-02-gganimate-has-transitioned-to-a-state-of-release_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
Now, to make it all move, as well as style it a bit for a better effect:
</p>
<pre class="r"><code>ggplot(fireworks) + 
  geom_point(aes(x, y, colour = colour, group = id), size = 0.5, shape = 20) + 
  scale_colour_identity() + 
  coord_fixed(xlim = c(-65, 65), expand = FALSE, clip = 'off') +
  theme_void() + 
  theme(plot.background = element_rect(fill = 'black', colour = NA), 
        panel.border = element_blank()) + 
  # Here comes the gganimate code
  transition_components(time, exit_length = 20) + 
  ease_aes(x = 'sine-out', y = 'sine-out') + 
  shadow_wake(0.05, size = 3, alpha = TRUE, wrap = FALSE, 
              falloff = 'sine-in', exclude_phase = 'enter') + 
  exit_recolour(colour = 'black')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2019-01-02-gganimate-has-transitioned-to-a-state-of-release_files/figure-html/unnamed-chunk-6-1.gif"><!-- -->
</p>
<p>
While I won’t go into detail, <code>transition_components()</code> allows all the points to follow their own trajectory and timeline independently, <code>ease_aes()</code> ensures that the velocity of the points tapers off, <code>shadow_wake()</code> is responsible for the trail after each point, and <code>exit_recolour()</code> makes sure the points gradually fade into the black background once they “burn out”.
</p>
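<p>
To give a feel for what “tapering off” means, here is a small base R sketch of a sine-out easing curve. It uses the textbook form <code>sin(t * pi / 2)</code>, which is assumed (not confirmed here) to match what <code>'sine-out'</code> computes internally. The helper name <code>sine_out()</code> is made up for this illustration:
</p>
<pre class="r"><code># Textbook sine-out easing: maps progress t in [0, 1] to eased progress
sine_out &lt;- function(t) sin(t * pi / 2)

t &lt;- seq(0, 1, by = 0.25)
eased &lt;- sine_out(t)

# The step sizes shrink over time, i.e. the movement decelerates
diff(eased)</code></pre>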
</section>
<section id="the-future" class="level2">
<h2 class="anchored" data-anchor-id="the-future">
The Future
</h2>
<p>
While this release is a milestone for gganimate, it is not a signal of it <em>being done</em>, as many things are still missing (even if we ignore the whole segue part of the grammar). It does signal a commitment to stability from now on, though, so you should feel confident in using this package without fearing that your code will break in the future. You can follow the state of the package at its website, <a href="https://gganimate.com">gganimate.com</a>, where I’ll also try to add additional guides and tutorials with time. If you create something with gganimate please share it on twitter, as I’m eager to see what people will make of it.
</p>
<p>
I’ll do a sort-of live cookbook talk on gganimate at this year’s RStudio conf in Austin, so if you are there and interested to learn more about the package do swing by.
</p>
<p>
Now, Go Animate!
</p>
</section>



 ]]></description>
  <category>announcement</category>
  <category>animation</category>
  <category>gganimate</category>
  <category>ggplot2</category>
  <category>package</category>
  <category>visualization</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2019-01-02-gganimate-has-transitioned-to-a-state-of-release/</guid>
  <pubDate>Thu, 03 Jan 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/gganimate_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>Entering and Exiting 2018</title>
  <link>https://tutor-church-15580.netlify.app/posts/2018-12-28-entering-and-exiting-2018/</link>
  <description><![CDATA[ 




<p>
The year is nearly over and it is the time for reflection and navel-gazing. I don’t have incredibly profound things to say, but a lot of things happened in 2018 and this is as good a time as any to go through it all…
</p>
<section id="picking-myself-up" class="level2">
<h2 class="anchored" data-anchor-id="picking-myself-up">
Picking Myself Up
</h2>
<p>
The prospects of my <a href="https://www.data-imaginist.com/2017/looking-back-on-2017/">“2017 in review”</a> post were not particularly rosy… I had hit somewhat of a burnout in terms of programming, but was nonetheless positive and had a great job and a lot of positive feedback on <a href="https://github.com/thomasp85/patchwork">patchwork</a>. Further, I had RStudio::conf to look forward to, which would be my first IRL face-to-face with the R community at large. I had also promised to present a fully-fledged tidy approach to network analysis and while both <a href="https://github.com/thomasp85/ggraph">ggraph</a> and <a href="https://github.com/thomasp85/tidygraph">tidygraph</a> had already been released there were things I wanted to develop prior to presenting it. All-in-all there was a great impetus to pick myself up and get on with developing (not arguing that this is a fail-safe way to deal with burnout by the way).
</p>
</section>
<section id="rstudioconf2018l" class="level2">
<h2 class="anchored" data-anchor-id="rstudioconf2018l">
RStudio::conf(2018L)
</h2>
<p>
My trip to San Diego was amazing. If you ever get to go to an RStudio conference I don’t think you will be disappointed (full disclosure and spoiler-alert: I now work for RStudio). My suspicion that the R community is as amazing in real life as on Twitter was confirmed and it was great to finally get to see all those people I admire and look up to. My talk went fairly well I think — I haven’t watched the recordings as I don’t particularly enjoy watching myself talk, <a href="https://www.rstudio.com/resources/videos/tidying-up-your-network-analysis-with-tidygraph-and-ggraph/">but you can</a>, if you are so inclined. At the conference I got to chat a bit with Jenny Bryan (one of the admire/look-up-to people referenced above) and we discussed what we were going to talk about in our respective keynotes at useR in Brisbane in the summer. I half-jokingly said that I might talk about <a href="https://gganimate.com">gganimate</a> because that would give me the required push to actually begin developing it…
</p>
</section>
<section id="talk-driven-development" class="level2">
<h2 class="anchored" data-anchor-id="talk-driven-development">
Talk-Driven Development
</h2>
<p>
Around April Dianne Cook was getting pushy about getting at least a talk title for my keynote, and at that point I had already imagined a couple of slides on gganimate and thought “to heck with it” and responded with the daunting title of <em>The Grammar of Animation</em>. At that point I had still not written a single line of code for gganimate, and knew that <a href="https://github.com/thomasp85/tweenr">tweenr</a> would need a serious update to support what I had in mind. In addition, I knew I had to develop what ended up as <a href="https://github.com/thomasp85/transformr">transformr</a> before I could begin with gganimate proper. All-in-all my talk title could not be more stress-inducing…
</p>
<p>
Thankfully I had a pretty clear vision in my head (which was also why I wanted to talk about it) so the motivation was there to drag me along for the ride. Another great benefit of developing tools for data visualisation in general and animation in particular, is that it sets Twitter on fire. After getting tweenr and transformr into a shape sufficient to support gganimate, I began to create the backbone of the package, and once I shared the first animation created with it, it was clear that I was in the pursuit of something that resonated with a lot of people.
</p>
<p>
To my great surprise I was able to get gganimate to a state where it actually supported the main grammar I had in mind prior to useR, and I could begin to make the presentation I had in mind:
</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/21ZWDrTukEs" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="">
</iframe>
<p>
useR was a great experience, not only because I was able to give the talk I had hoped for, but also due to the greatness of the organisers and the attendees. I was able to get to meet a lot of the members of R Core for the first time and they were very supportive of my quest to improve the performance of the R graphic stack (last slide of my talk), so I had high hopes that this might be achievable within the next 5-10 years (it is no small task). I had been surprised by the support for my ideas about animations and their relevance within the R community, so in general the conference left me invigorated and with the stamina to complete gganimate.
</p>
</section>
<section id="intermezzo" class="level2">
<h2 class="anchored" data-anchor-id="intermezzo">
Intermezzo
</h2>
<p>
I managed to release a couple of packages that do not fit into the narrative I’m trying to create for this year, but they deserve a mention nonetheless.
</p>
<p>
In the beginning of the year I was able to finish off <a href="https://github.com/thomasp85/particles">particles</a>, a port and extension of the d3-force algorithm developed by Mike Bostock. It can be used for both great fun and work and did among other things result in this beautiful pixel-decomposition of Hadley:
</p>
<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/hadley.gif">
</p>
<p>
While making improvements to tweenr in anticipation of gganimate it became clear that colour conversion was a main bottleneck and I ended up developing <a href="https://github.com/thomasp85/farver">farver</a> to improve on this. Beyond very fast colour conversion it also allows a range of different colour distance calculations to be performed. Some of the discussion that followed the development of this package led to Brodie Gaslam improving the colour conversion performance in base R and while it is not as fast as farver, it is pretty close and future versions of R will definitely benefit from his great contribution.
</p>
<p>
I haven’t had much time to make generative art this year, but I did manage to find time for some infrastructure work that will support my endeavours in this space in the future. The <a href="https://github.com/thomasp85/ambient">ambient</a> package is able to produce all sorts of multidimensional noise in a very performant way due to the speed of the underlying C++ library. I’m planning to expand on this package quite a bit when I get the time as I have lots of cool ideas for how to manipulate noise in a tidy manner.
</p>
<p>
How you use colours in data visualisation is extremely important, which is also why the data visualisation community has embraced the viridis colour scale to the extent that they have. I’ve personally grown tired of the aesthetic though, so when I saw a range of perceptually uniform palettes developed by Fabio Crameri I was quick to bring them to R with the <a href="https://github.com/thomasp85/scico">scico</a> package. To my surprise the development of a colour palette package became my most contentious contribution this year (that I know of), so I welcome everyone who is tired of colour palette packages to ignore it altogether.
</p>
</section>
<section id="transition_hobby_work" class="level2">
<h2 class="anchored" data-anchor-id="transition_hobby_work">
transition_hobby_work()
</h2>
<p>
Prior to useR I had begun to receive some cryptic questions from Hadley and it was clear that he was either trolling me or that something was brewing. During the late summer it became clear that it was the latter (thankfully), as RStudio wanted me to work full time on improving the R graphic stack. Working for RStudio on something so aligned with my own interest is beyond what I had hoped for, so despite my joy in working for the Danish tax authorities the switch was a no-brainer. I wish my former office all the best (they are doing incredible work) and look forward to seeing some of them at RStudio conf in Austin later in the month.
</p>
<p>
Being part of the tidyverse team has so far been a great experience. I’ve been lucky enough to meet several of them already as part of the different conferences I attended this year, so working remotely with them doesn’t feel that strange. It can be intimidating to work with such a talented team, but if that is the least of my concerns I’m pretty sure I can manage that.
</p>
<p>
I look forward to sharing the performance improvements I’m making with all of you throughout the coming years, and hopefully I’ll have time to also improve on some of my packages that have received less attention during the development of gganimate.
</p>
<p>
Happy New Year!
</p>
</section>



 ]]></description>
  <category>year-in-review</category>
  <category>gganimate</category>
  <category>tweenr</category>
  <category>transformr</category>
  <category>tidygraph</category>
  <category>ggraph</category>
  <category>scico</category>
  <category>ambient</category>
  <category>farver</category>
  <category>particles</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2018-12-28-entering-and-exiting-2018/</guid>
  <pubDate>Wed, 02 Jan 2019 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/logo.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>transformr: Age of Spatial</title>
  <link>https://tutor-church-15580.netlify.app/posts/2018-12-03-transformr-age-of-spatial/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/transformr_logo_small.png" align="right" style="width:50%;max-width:200px;">
</p>
<p>
Once again, it gives me great pleasure to announce that a new package has joined CRAN. <code>transformr</code> is the spatial brother of <code>tweenr</code> and as with the <code>tweenr</code> update a few months ago, this package is very much driven by the infrastructural needs of <code>gganimate</code>. It is probably the last piece needed before I can begin preparing <code>gganimate</code> for CRAN, so if you are waiting for that there is indeed reason for celebration.
</p>
<section id="becoming-spatial" class="level2">
<h2 class="anchored" data-anchor-id="becoming-spatial">
Becoming Spatial
</h2>
<p>
As written above, <code>transformr</code> is <code>tweenr</code> for spatial data (spatial being used in a very broad sense as any data that is partly coordinates). To understand what this means we’ll briefly have to touch on a core concept of <code>tweenr</code>. What is never said out loud, but generally implied, is that <code>tweenr</code> treats all columns of the data frame as independent. This is generally a sound principle as you don’t want values from other columns to influence how e.g.&nbsp;the colour transitions between black and blue. As far as spatial data is concerned, this approach also works fine as long as each row in the data frame encodes a single independent point in space, or there’s a one-to-one mapping between the points of two polygons. Alas, the devil’s in the detail, and <code>tweenr</code> breaks down in magnificent ways if you try to tween between more complicated and heterogeneous shapes, e.g.&nbsp;a star and a circle. This is not something unique to <code>tweenr</code>, mind you; d3.js also has this limitation. The problems in d3 led <a href="https://github.com/veltman">Noah Veltman</a> to develop the <a href="https://github.com/veltman/flubber">flubber</a> JavaScript library. His reasons for developing it are succinctly described in the animation below, grabbed from the readme of flubber:
</p>
<p>
<img src="https://user-images.githubusercontent.com/2120446/27014160-e0ce7c04-4ea7-11e7-8da4-5dde839290eb.gif">
</p>
</section>
<section id="the-trials-of-the-polygon" class="level2">
<h2 class="anchored" data-anchor-id="the-trials-of-the-polygon">
The Trials of the Polygon
</h2>
<p>
So, what’s the deal with polygons exactly? Why don’t they just do as you expect them to and morph naturally from one to the other? The sad state of affairs is that there are multiple reasons for that:
</p>
<ol style="list-style-type: decimal">
<li>
There might be discrepancy between the number of points that make up the two polygons. This may lead to part of the shape simply appearing or disappearing at the start or end of the tween.
</li>
<li>
The winding of the polygons may have a different angular offset and/or direction. This means that the tween will include rotation and/or inversion, something that is often undesirable.
</li>
<li>
There may be a discrepancy in the number of polygons that make up the two shapes you tween between and/or a discrepancy between the number of holes. As with 1. this may lead to parts of the shapes suddenly appearing or disappearing during the tween.
</li>
</ol>
</section>
<section id="running-the-gauntlet" class="level2">
<h2 class="anchored" data-anchor-id="running-the-gauntlet">
Running the Gauntlet
</h2>
<p>
<code>transformr</code> tries to solve the three problems above in much the same way as flubber does, at least conceptually. There are enough differences between how JavaScript and R (as well as d3 and <code>tweenr</code>) work with data that I decided to only take the ideas behind flubber and implement them in my own way, in a manner fitting for R, rather than doing a direct port of the library. This means that you cannot expect the two libraries to behave equivalently. Below is, at a very high level, what <code>transformr</code> does to address the 3 problems outlined above:
</p>
<ol style="list-style-type: decimal">
<li>
Points are added along the edges of the shape with the fewest corners until the number of points matches between the shapes. Points are added so that long edges will be divided more often than short edges in order to even out the edge lengths of the final shape. Further, if any shape has fewer than a given number of corners, points will be added (following the same strategy) until the number of corners is reached.
</li>
<li>
After the number of points is evened out, the winding direction is matched between the shapes (as clockwise), and the last shape is rotated until the squared distance between point pairs of the two shapes is minimised.
</li>
<li>
This is addressed first (but is the least prevalent problem so it is mentioned last). If there is a different number of polygons in the two states you wish to tween between, the polygons in the state with the fewest polygons are cut until the number matches. Once again, the cuts are distributed so that large polygons are cut more often than small ones. After the cutting, polygons between the states are matched by minimising distance and area difference. If there are differences in the number of holes in the matched polygons, zero-area holes are inserted at the gravitational center of the polygon with the fewest holes until the number matches.
</li>
</ol>
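<p>
To make the first two steps a bit more tangible, here is a rough base R sketch of the ideas. It is not the code <code>transformr</code> uses: the helper names (<code>signed_area()</code>, <code>add_points()</code>, <code>make_clockwise()</code>, <code>align_rotation()</code>) are invented for this illustration, midpoint splitting is a simplification of the point-adding strategy, and polygons are assumed to be data frames holding only an <code>x</code> and a <code>y</code> column:
</p>
<pre class="r"><code># Signed area via the shoelace formula; positive means counter-clockwise
signed_area &lt;- function(poly) {
  nxt &lt;- c(seq_len(nrow(poly))[-1], 1)
  sum(poly$x * poly$y[nxt] - poly$x[nxt] * poly$y) / 2
}

# Step 1: split the longest edge at its midpoint until there are n points
add_points &lt;- function(poly, n) {
  while (nrow(poly) &lt; n) {
    nxt &lt;- c(seq_len(nrow(poly))[-1], 1)
    len &lt;- sqrt((poly$x[nxt] - poly$x)^2 + (poly$y[nxt] - poly$y)^2)
    i &lt;- which.max(len)
    mid &lt;- data.frame(x = (poly$x[i] + poly$x[nxt[i]]) / 2,
                      y = (poly$y[i] + poly$y[nxt[i]]) / 2)
    poly &lt;- rbind(poly[seq_len(i), ], mid, poly[-seq_len(i), ])
  }
  poly
}

# Step 2a: force clockwise winding by reversing the point order if needed
make_clockwise &lt;- function(poly) {
  if (signed_area(poly) &gt; 0) poly &lt;- poly[rev(seq_len(nrow(poly))), ]
  poly
}

# Step 2b: cyclically shift `to` so the summed squared distance to `from` is minimal
align_rotation &lt;- function(from, to) {
  n &lt;- nrow(from)
  shifted &lt;- function(shift) (seq_len(n) + shift - 1) %% n + 1
  cost &lt;- vapply(seq_len(n) - 1, function(shift) {
    idx &lt;- shifted(shift)
    sum((from$x - to$x[idx])^2 + (from$y - to$y[idx])^2)
  }, numeric(1))
  to[shifted(which.min(cost) - 1), ]
}

square &lt;- data.frame(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1))
triangle &lt;- data.frame(x = c(0, 1, 0.5), y = c(0, 0, 1))

from &lt;- make_clockwise(square)
to &lt;- align_rotation(from, make_clockwise(add_points(triangle, nrow(from))))</code></pre>
<p>
After this, <code>from</code> and <code>to</code> have the same number of points, the same winding, and a point order that keeps the travel distance low, so tweening them row by row no longer twists or flips the shape.
</p>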
</section>
<section id="the-ways-of-the-transformr" class="level2">
<h2 class="anchored" data-anchor-id="the-ways-of-the-transformr">
The Ways of the Transformr
</h2>
<p>
At this point we have only talked about shapes (and polygons), so let’s get a bit more concrete. <code>transformr</code> currently recognises three data types: <em>polygons</em>, <em>paths</em>, and <em>simple features</em>. Polygons encompass simple polygons as well as polygons with any number of holes. Paths can be either single or multipaths. Simple features as implemented by the <code>sf</code> package are supported, currently covering the (multi)point, (multi)path, and (multi)polygon types.
</p>
<p>
In terms of tween type support, <code>transformr</code> currently extends the <code>tween_state()</code> API from <code>tweenr</code> but support for the other types of tweeners will be added with time.
</p>
</section>
<section id="some-examples" class="level2">
<h2 class="anchored" data-anchor-id="some-examples">
Some Examples
</h2>
<p>
At this point an example is probably in order. We’ll start with what we first identified as a problematic case: morphing between a circle and a star:
</p>
<pre class="r"><code>library(transformr)
library(ggplot2)

# Helpers included in transformr
circle &lt;- poly_circle()
star &lt;- poly_star()

# The data is a simple data.frame as you would feed into ggplot2
head(star)</code></pre>
<pre><code>##              x          y id
## 1 0.000000e+00  1.0000000  1
## 2 2.938926e-01  0.4045085  1
## 3 9.510565e-01  0.3090170  1
## 4 4.755283e-01 -0.1545085  1
## 5 5.877853e-01 -0.8090170  1
## 6 6.123234e-17 -0.5000000  1</code></pre>
<pre class="r"><code># We use tween_polygon to morph between the two
morph &lt;- tween_polygon(circle, star, 
                       ease = 'linear',
                       id = id,
                       nframes = 12)

# You get back a data.frame with the same special columns as with tweenr
head(morph)</code></pre>
<pre><code>##            x         y id .id .phase .frame
## 1 0.00000000 1.0000000  1   1    raw      1
## 2 0.01745241 0.9998477  1   1    raw      1
## 3 0.03489950 0.9993908  1   1    raw      1
## 4 0.05233596 0.9986295  1   1    raw      1
## 5 0.06975647 0.9975641  1   1    raw      1
## 6 0.08715574 0.9961947  1   1    raw      1</code></pre>
<pre class="r"><code># Let's see the result
ggplot(morph) + 
  geom_polygon(aes(x = x, y = y, group = id), fill = NA, colour = 'black') + 
  facet_wrap(~.frame, labeller = label_both, ncol = 3) + 
  theme_void()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-12-03-transformr-age-of-spatial_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<p>
What would happen if we upped the stakes a bit? Let’s try with three circles morphing into a star with a hole:
</p>
<pre class="r"><code>circles &lt;- poly_circles()
star_hole &lt;- poly_star_hole()

morph &lt;- tween_polygon(circles, star_hole, 
                       ease = 'linear',
                       id = id, 
                       nframes = 12,
                       match = FALSE)

ggplot(morph) + 
  geom_polygon(aes(x = x, y = y, group = id), fill = NA, colour = 'black') + 
  facet_wrap(~.frame, labeller = label_both, ncol = 3) + 
  theme_void()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-12-03-transformr-age-of-spatial_files/figure-html/unnamed-chunk-3-1.png" width="672">
</p>
<p>
We introduced a new argument in <code>tween_polygon()</code> here. <code>match</code> is used to define whether polygons are matched by the value of <code>id</code> or whether all polygons in the first state should somehow morph into all polygons in the last state. If we set <code>match = TRUE</code>, we can use the <code>enter</code> and <code>exit</code> arguments to define what should happen to unmatched polygons:
</p>
<pre class="r"><code>morph &lt;- tween_polygon(circles, star_hole, 
                       ease = 'linear',
                       id = id, 
                       nframes = 12,
                       match = TRUE,
                       exit = function(.x) transform(.x, x = mean(x), y = mean(y)))

ggplot(morph) + 
  geom_polygon(aes(x = x, y = y, group = id), fill = NA, colour = 'black') + 
  facet_wrap(~.frame, labeller = label_both, ncol = 3) + 
  theme_void()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-12-03-transformr-age-of-spatial_files/figure-html/unnamed-chunk-4-1.png" width="672">
</p>
<p>
You’ll see a weird glitch above with the hole in the star reaching out to the edge, but this is simply <code>ggplot2</code> not knowing how to deal with holed polygons in <code>geom_polygon()</code> — I’ll handle that in another post…
</p>
<p>
What is not shown above is that <code>transformr</code> and <code>tween_polygon()</code> work well together with <code>keep_state()</code> from <code>tweenr</code> and that it is pipe-able, but if you are used to <code>tween_state()</code> this will all come naturally…
</p>
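<p>
A hedged sketch of what such a chain could look like is given below. It assumes that <code>tween_polygon()</code> accepts the output of a previous tween as its first argument, as the paragraph above describes. The frame counts are arbitrary, and <code>magrittr</code> is loaded explicitly for the pipe:
</p>
<pre class="r"><code>library(transformr)
library(tweenr)
library(magrittr)

# circle -&gt; star, hold the star for a few frames, then star -&gt; circle
morph &lt;- poly_circle() %&gt;% 
  tween_polygon(poly_star(), ease = 'linear', nframes = 12, id = id) %&gt;% 
  keep_state(nframes = 4) %&gt;% 
  tween_polygon(poly_circle(), ease = 'linear', nframes = 12, id = id)</code></pre>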
<p>
While path and sf morphing works in much the same way as shown above, I’ll quickly showcase it for completeness:
</p>
<pre class="r"><code>spiral &lt;- path_spiral()
waves &lt;- path_waves()

morph &lt;- tween_path(spiral, waves,
                    ease = 'linear',
                    nframes = 12, 
                    id = id,
                    match = FALSE)

ggplot(morph) + 
  geom_path(aes(x = x, y = y, group = id), colour = 'black') + 
  facet_wrap(~.frame, labeller = label_both, ncol = 3) + 
  theme_void()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-12-03-transformr-age-of-spatial_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<pre class="r"><code>circle_st &lt;- sf::st_sf(geometry = sf::st_sfc(poly_circle(st = TRUE)))
north_carolina &lt;- sf::st_read(system.file("shape/nc.shp", package = "sf"), 
                              quiet = TRUE)
north_carolina &lt;- st_normalize(sf::st_combine(north_carolina))
north_carolina &lt;- sf::st_sf(geometry = sf::st_sfc(north_carolina))

morph &lt;- tween_sf(north_carolina, circle_st,
                  ease = 'linear',
                  nframes = 12)

ggplot(morph) + 
  geom_sf(aes(geometry = geometry), colour = 'white', fill = 'black', size = .1) + 
  facet_wrap(~.frame, labeller = label_both, ncol = 3) + 
  coord_sf(datum = NULL) + 
  theme_void()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-12-03-transformr-age-of-spatial_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<p>
As can be seen, <code>transformr</code> can handle most of the things you choose to throw at it, when it comes to morphing between different shapes. It is used under the hood in <code>gganimate</code> to power polygon, path, and sf geom transitions (and derivatives thereof), but can just as well be used directly in the same way as <code>tweenr</code> can…
</p>
<p>
I do hope you’ll enjoy <code>transformr</code> either simply through the magic of <code>gganimate</code> or by playing with it directly — the results can be quite mesmerizing…
</p>
</section>



 ]]></description>
  <category>animation</category>
  <category>visualization</category>
  <category>transformr</category>
  <category>announcement</category>
  <category>package</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2018-12-03-transformr-age-of-spatial/</guid>
  <pubDate>Sun, 09 Dec 2018 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/transformr_logo_small.png" medium="image" type="image/png" height="144" width="144"/>
</item>
<item>
  <title>The tweenr is all grown up</title>
  <link>https://tutor-church-15580.netlify.app/posts/2018-09-26-the-tweenr-is-all-grown-up/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/tweenr_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<blockquote class="blockquote">
<p>
NOTE: tweenr was released some time ago but a theft of my computer while writing the release post meant that I only just finished writing about it now.
</p>
</blockquote>
<p>
I’m very happy to once again announce a package update, as a new major version of <code>tweenr</code> is now on CRAN. This release, while significant in itself, is also an important part of getting <code>gganimate</code> on CRAN, so if you care about <code>gganimate</code> but have never heard of <code>tweenr</code> you should be happy nonetheless.
</p>
<p>
<code>tweenr</code> was my first sort-of popular package and filled a gap in the gganimate version of yore, where smooth transitions were something you had to bring yourself. It has lived almost unchanged since its initial release, but as I began to develop the next iteration of <code>gganimate</code> it became clear that new functionality was needed. Some of it ended up in <a href="https://github.com/thomasp85/transformr"><code>transformr</code></a> but a huge chunk has been added to tweenr itself. A description of everything new follows below.
</p>
<section id="something-new-something-old-new_old" class="level2">
<h2 class="anchored" data-anchor-id="something-new-something-old-new_old">
Something new, something old
</h2>
<p>
The main API of the previous version of <code>tweenr</code> consisted of <code>tween_states()</code>, <code>tween_elements()</code>, and <code>tween_appear()</code>. All of these needed serious changes in both capabilities and API, to the extent that all prior code would break, so instead I decided to keep the old functions unchanged and create new ones for a brighter future.
</p>
<section id="tween_states-tween_statekeep_state" class="level3">
<h3 class="anchored" data-anchor-id="tween_states-tween_statekeep_state">
tween_states ⇒ tween_state/keep_state
</h3>
<p>
<code>tween_states()</code> was perhaps the most used of the old functions. It takes a list of data.frames and a specification of how long transitions between them should take and how long it should pause at each data.frame. This is perfect for situations where you have discrete states and want to have a smooth transition between them. Still, it had certain shortcomings, such as requiring that each data.frame included the same number of rows (meaning that each “element” should be present in each state). Further, I have ended up finding it a bit clumsy to use. The new functions that serve the same needs as <code>tween_states()</code> are <code>tween_state()</code> and <code>keep_state()</code>, and they are much more powerful. The biggest difference, perhaps, is that <code>tween_state()</code> only takes a <em>from</em> and a <em>to</em> state and not an arbitrarily long list of states. If transitions are needed between more states you’ll need to chain calls together (potentially with <code>keep_state()</code> if the state should pause between transitions). It will look something like this:
</p>
<pre class="r"><code>library(tweenr)
library(magrittr) # provides the %&gt;% pipe used below
irises &lt;- split(iris, iris$Species)
iris_tween &lt;- irises$setosa %&gt;% 
  tween_state(irises$versicolor, ease = 'cubic-in-out', nframes = 10) %&gt;% 
  keep_state(5) %&gt;% 
  tween_state(irises$virginica, ease = 'linear', nframes = 15) %&gt;% 
  keep_state(5)</code></pre>
<p>
As can be seen, if you like piping you’ll feel right at home. Apart from the seemingly superficial change in API, <code>tween_state()</code> also packs some new tricks. One of these is per-column easing, so you can specify different easing functions for different variables. Another, more fundamental one, is the possibility of specifying an id to match rows by. Ultimately this means that rows no longer need to be matched by position and that you can now tween between states with different numbers of elements. All of this is so important that it will get its own section later on in Enter and Exit.
</p>
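<p>
As a minimal sketch of id matching and per-column easing (the data and the <code>name</code> column are made up for illustration, and I am assuming that <code>ease</code> accepts one entry per column, as the documentation describes), something like this should work:
</p>
<pre class="r"><code>library(tweenr)

# Hypothetical states of different sizes; 'name' identifies each element
state1 &lt;- data.frame(name = c('a', 'b'), x = c(1, 2), y = c(1, 1))
state2 &lt;- data.frame(name = c('b', 'c', 'd'), x = c(5, 6, 7), y = c(3, 3, 3))

# Rows are matched on 'name' rather than on position.
# ease has one entry per column (name, x, y); the entry for the
# character column is only a placeholder as it cannot be interpolated
matched &lt;- tween_state(
  state1, state2,
  ease = c('linear', 'linear', 'cubic-in-out'),
  nframes = 5,
  id = name
)

head(matched)</code></pre>
<p>
Since no enter or exit function is given here, elements without a match simply appear or disappear.
</p>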
</section>
<section id="tween_elements-tween_components" class="level3">
<h3 class="anchored" data-anchor-id="tween_elements-tween_components">
tween_elements ⇒ tween_components
</h3>
<p>
While <code>tween_states()</code> was probably the most used of the legacy functions it was by no means the only one. <code>tween_elements()</code> was a very powerful function that let you specify different individual states for each element in a single data.frame and then expand this to an arbitrary number of frames. The changes that its heir brings are less dramatic than what happened with <code>tween_states()</code>, and simply add the same features that <code>tween_state()</code> introduced. This means per-column easing and the same features as described below in Enter and Exit. Further, it changes the semantics of a couple of variables so they are now tidy evaluated. This means that state specifications can be calculated on the fly, rather than having to exist inside the data.frame to be tweened.
</p>
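<p>
To illustrate the tidy evaluation, here is a small sketch with made-up data. The <code>step</code> column and the rescaling factor are hypothetical; the point is only that <code>time</code> can be an expression rather than an existing column:
</p>
<pre class="r"><code>library(tweenr)

# Hypothetical key states for two elements (id 1 and id 2)
keys &lt;- data.frame(
  x = c(1, 2, 2, 1, 2, 2),
  y = c(1, 2, 2, 2, 1, 1),
  step = c(1, 4, 10, 4, 8, 10),
  id = c(1, 1, 1, 2, 2, 2)
)

# 'time' is evaluated in the context of the data, so it can be derived
# on the fly instead of being stored as its own column
frames &lt;- tween_components(
  keys,
  ease = 'cubic-in-out',
  nframes = 50,
  time = step * 10,
  id = id
)

head(frames)</code></pre>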
</section>
<section id="tween_appear-tween_events" class="level3">
<h3 class="anchored" data-anchor-id="tween_appear-tween_events">
tween_appear ⇒ tween_events
</h3>
<p>
<code>tween_appear()</code> never really felt right. The purpose was to let each row appear in a specific frame while still allowing the user to define how it should appear. To this end it expanded the data.frame by giving each row an <em>age</em> in each frame (negative age meant that it had yet to appear) and then let the user do with this as they pleased. What was missing was the whole idea of Enter and Exit which I have already plugged multiple times. <code>tween_events()</code> is a pretty radical change in order to solve the problem that <code>tween_appear()</code> originally tried (but failed) to solve.
</p>
</section>
</section>
<section id="enter-and-exit" class="level2">
<h2 class="anchored" data-anchor-id="enter-and-exit">
Enter and Exit
</h2>
<p>
Before we go any further with other new tweening functions I think I owe it to you to describe what all this entering and exiting I’ve been talking about really is. If you have dabbled in D3.js the two words will be familiar, but they have a slightly different meaning in <code>tweenr</code>. In D3, enter and exit describe a selection of data that did not match in a data join between the current and next state, while in <code>tweenr</code> each is a function that modifies data that is going to appear or disappear. If enter and/or exit is not given, the data will just pop into existence in the first frame it relates to and disappear without a trace after the last frame it relates to has ended. If you provide e.g.&nbsp;an enter function, it will be applied to all elements when they first appear. The result of the function will then be inserted into the tweening prior to the original data, so that any changes the function makes will gradually change to the original data. This may sound quite confusing, but in essence it means that if you pass in an enter function that sets the transparency variable to zero you’ll get a gradual fade-in effect. The exit function is just like it, but in reverse. But let’s see it in effect instead:
</p>
<pre class="r"><code>df1 &lt;- data.frame(x = 1:2, y = 2, alpha = 1) # 2 rows
df2 &lt;- data.frame(x = 2:0, y = 1, alpha = 1) # 3 rows

fade &lt;- function(data) {
  data$alpha &lt;- 0
  data
}

tween &lt;- tween_state(df1, df2, ease = 'linear', nframes = 5, enter = fade)

tween$alpha[tween$.id == 3]</code></pre>
<pre><code>## [1] 0.25 0.50 0.75 1.00</code></pre>
<p>
We can see that the alpha value of the third element (the one that doesn’t exist in the first state) gradually increases from zero to one. The reason why 0 isn’t included is that the enter and exit functions return virtual states that don’t remain in the data - only the transition to and from them does.
</p>
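<p>
The exit side can be sketched in the same way. This simply reverses the direction of the example above and reuses the same fade function, so the third element now has no match in the target state and should fade out instead of in:
</p>
<pre class="r"><code>library(tweenr)

df1 &lt;- data.frame(x = 1:2, y = 2, alpha = 1) # 2 rows
df2 &lt;- data.frame(x = 2:0, y = 1, alpha = 1) # 3 rows

fade &lt;- function(data) {
  data$alpha &lt;- 0
  data
}

# Going from df2 to df1 leaves the third element without a partner,
# so the exit function defines the virtual state it transitions towards
tween_out &lt;- tween_state(df2, df1, ease = 'linear', nframes = 5, exit = fade)

tween_out$alpha[tween_out$.id == 3]</code></pre>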
</section>
<section id="new-tweens-on-the-block" class="level2">
<h2 class="anchored" data-anchor-id="new-tweens-on-the-block">
New tweens on the block
</h2>
<p>
Apart from the new versions of the old functionality discussed above, this version also includes some brand new tweens, mainly implemented to serve needs in <code>gganimate</code> but of course also available for everyone else.
</p>
<section id="tween_along" class="level3">
<h3 class="anchored" data-anchor-id="tween_along">
tween_along
</h3>
<p>
This is the tweening function that powers <code>transition_reveal()</code> in <code>gganimate</code> - if you have played with that you are fairly well situated to understand what it does. In essence it allows you to specify time points for the different rows in your data.frame and then tween between these. You might think that this is exactly what <code>tween_components()</code> does and you’d partly be right. The big difference is that <code>tween_along()</code> ensures equidistant frames, whereas <code>tween_components()</code> assigns all rows in the data to the nearest frame and then uses the frame as a time variable. The latter will always have the raw data appearing in one frame or another while the former will not. Further, <code>tween_along()</code> will optionally keep earlier rows in your data at each frame, which is useful if you e.g.&nbsp;want a line to gradually appear along an axis.
</p>
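<p>
A rough sketch of how this might be used follows. The track data is made up, and I am assuming the <code>along</code> and <code>history</code> arguments behave as in the current documentation:
</p>
<pre class="r"><code>library(tweenr)

# Hypothetical, irregularly sampled track; 'time' says when each row was observed
track &lt;- data.frame(
  x = c(0, 1, 4, 9),
  y = c(0, 2, 3, 1),
  time = c(0, 1, 5, 6)
)

# 10 equidistant frames along 'time', keeping only the current position
position &lt;- tween_along(track, ease = 'linear', nframes = 10,
                        along = time, history = FALSE)

# The same, but keeping the rows already passed in each frame,
# which is what you want for a gradually drawn line
trail &lt;- tween_along(track, ease = 'linear', nframes = 10,
                     along = time, history = TRUE)</code></pre>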
</section>
<section id="tween_at" class="level3">
<h3 class="anchored" data-anchor-id="tween_at">
tween_at
</h3>
<p>
This is a pretty low-level tweening function intended to get an exact state between two data frames or vectors. It takes two states and a numeric vector giving the tween position for each row and then calculates the intermediary rows.
</p>
<pre class="r"><code>tween_at(mtcars[1:3, ], mtcars[4:6, ], runif(3), 'linear')</code></pre>
<pre><code>##        mpg      cyl     disp       hp     drat       wt     qsec        vs
## 1 21.27039 6.000000 226.2455 110.0000 3.345701 3.022205 18.47440 0.6759747
## 2 20.51284 6.423621 202.3621 123.7677 3.741142 2.994673 17.02000 0.0000000
## 3 19.77526 5.287121 183.2966 100.7227 3.148519 3.053659 19.64613 1.0000000
##          am     gear     carb
## 1 0.3240253 3.324025 1.972076
## 2 0.7881894 3.788189 3.576379
## 3 0.3564393 3.356439 1.000000</code></pre>
<p>
This is unlikely to be useful for directly creating animations (though I guess it is low-level enough to be able to be shoehorned into anything). It is being used in <code>gganimate</code> for calculating shadow falloff in <code>shadow_wake()</code>.
</p>
</section>
<section id="tween_fill" class="level3">
<h3 class="anchored" data-anchor-id="tween_fill">
tween_fill
</h3>
<p>
This tween takes a page out of <code>tidyr::fill</code> and simply fills out missing elements in a data frame or vector. Instead of being boring and repeating the prior or following data it does the <code>tweenr</code> thing and tweens between them:
</p>
<pre class="r"><code>mtcars2 &lt;- mtcars[1:7, ]
mtcars2[2:6, ] &lt;- NA

tween_fill(mtcars2, 'cubic-in-out')</code></pre>
<pre><code>##        mpg      cyl     disp    hp     drat       wt     qsec vs
## 1 21.00000 6.000000 160.0000 110.0 3.900000 2.620000 16.46000  0
## 2 20.87593 6.037037 163.7037 112.5 3.887222 2.637593 16.44852  0
## 3 20.00741 6.296296 189.6296 130.0 3.797778 2.760741 16.36815  0
## 4 17.65000 7.000000 260.0000 177.5 3.555000 3.095000 16.15000  0
## 5 15.29259 7.703704 330.3704 225.0 3.312222 3.429259 15.93185  0
## 6 14.42407 7.962963 356.2963 242.5 3.222778 3.552407 15.85148  0
## 7 14.30000 8.000000 360.0000 245.0 3.210000 3.570000 15.84000  0
##           am     gear carb
## 1 1.00000000 4.000000    4
## 2 0.98148148 3.981481    4
## 3 0.85185185 3.851852    4
## 4 0.50000000 3.500000    4
## 5 0.14814815 3.148148    4
## 6 0.01851852 3.018519    4
## 7 0.00000000 3.000000    4</code></pre>
<p>
Neat-o…
</p>
</section>
</section>
<section id="grab-bag-of-niceties" class="level2">
<h2 class="anchored" data-anchor-id="grab-bag-of-niceties">
Grab bag of niceties
</h2>
<p>
There are more subtle additions to <code>tweenr</code> as well, most of which have also been driven by <code>gganimate</code> needs. Here’s an unceremonious list:
</p>
<ul>
<li>
<em>More information:</em> In olden days <code>tweenr</code> simply added a <code>.frame</code> column to the output to identify the frame the row belonged to. It still does, but that column is now accompanied by a <code>.phase</code> column that tells if the data is raw, static, entering, exiting, or transitioning, and an <code>.id</code> column that identifies the same data across frames.
</li>
<li>
<em>More support:</em> The supported data types have been expanded considerably. Most notably, list columns are now accepted. If the list only contains numeric vectors these vectors will get tweened accordingly, and if not the list will be treated as constant.
</li>
<li>
<em>Better colour support:</em> Colour has always been tweened in the LAB representation to get a more natural transition. In the old version the native <code>convertColor()</code> function was used, but this could lead to substantial slowdown when tweening lots of data. To address this <code>farver</code> was developed and released and this version of <code>tweenr</code> naturally uses this for colour space conversions now. In addition, <code>tweenr</code> now supports hex-colours with alpha.
</li>
</ul>
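<p>
The bookkeeping columns are easy to inspect yourself. A minimal sketch with a made-up one-row data.frame:
</p>
<pre class="r"><code>library(tweenr)

# Hypothetical single element moving from x = 1 to x = 3
from &lt;- data.frame(x = 1)
to &lt;- data.frame(x = 3)

tw &lt;- tween_state(from, to, ease = 'linear', nframes = 4)

# .frame identifies the frame, .id the element, and .phase the kind of row
tw[, c('x', '.id', '.phase', '.frame')]
table(tw$.phase)</code></pre>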
<p>
I think that is it, but frankly there have been so many additions that I may have missed a few… First to find something I missed gets a sticker!
</p>
</section>



 ]]></description>
  <category>tweenr</category>
  <category>announcement</category>
  <category>package</category>
  <category>animation</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2018-09-26-the-tweenr-is-all-grown-up/</guid>
  <pubDate>Mon, 22 Oct 2018 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/tweenr_logo_small.png" medium="image" type="image/png" height="166" width="144"/>
</item>
<item>
  <title>What Are We Plotting, What Are We Animating</title>
  <link>https://tutor-church-15580.netlify.app/posts/2018-09-22-what-are-we-plotting-what-are-we-animating/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/gganimate_logo_small.png" align="right" style="width:50%;max-width:200px;margin-left:5pt">
</p>
<p>
This is my first blog post about <code>gganimate</code> — a package I’ve been working on since mid-spring this year. I have many thoughts and lots to say about animation and <code>gganimate</code>, so much in fact that it has seemed too big a task to begin writing about. Further, I felt like I had to spend my time developing the thing in the first place.
</p>
<p>
So this is an alternative entrance into writing about <code>gganimate</code> — sort of a tech-note about a specific problem. There will still come a time for some more formal writing about the theory and use of <code>gganimate</code> but until then I’ll refer to my <a href="https://youtu.be/21ZWDrTukEs">useR keynote</a> for any words on my thoughts behind it all.
</p>
<section id="the-problem" class="level2">
<h2 class="anchored" data-anchor-id="the-problem">
The Problem
</h2>
<p>
When we animate data visualisations we often do it by calculating intermediary data points resulting in a smooth transition between the states represented by the raw data. In <code>gganimate</code> this is done by adding a <em>transition</em> which defines how data should be expanded across the animation frames. Underneath it all most transitions calculate intermediary data representations using <code>tweenr</code> and <code>transformr</code> — so far, so good.
</p>
<p>
What we have glossed over, and what is at the center of the problem, is what state of the data we decide to use as basis for our expansion. If you are not familiar with <code>ggplot2</code> and the grammar of graphics this might be a strange phrasing (data is data, after all), but if you are, you’ll know that data can undergo several statistical transformations before it is encoded into a visual property and put on paper (or screen). Some of the states the data goes through are:
</p>
<ol style="list-style-type: decimal">
<li>
Raw data as it is passed into the plotting function
</li>
<li>
Raw data with only the columns mapped to aesthetics present
</li>
<li>
Data transformed by a statistic
</li>
<li>
Data with aesthetics mapped to a scale
</li>
<li>
Data with default aesthetic values added
</li>
<li>
Data transformed by the geom
</li>
</ol>
<p>
If you prepare your data for animation beforehand (e.g.&nbsp;using <code>tweenr</code>), you’re only able to touch the data at the first state and are thus limited in what you can do. If there is a one-to-one mapping between the raw data and the final visual encoding this might not be a problem, but it breaks down spectacularly when the statistical transformation imposes a grouping of the data into a shared visual encoding, e.g.&nbsp;a box-plot. Consider the task of calculating intermediary data for a transition from one box-plot showing statistics for 10 points, to another box-plot showing statistics for 15 points. If you could only use the raw data your atomic observations would suddenly have to change from 10 to 15 values in a smooth manner. On the other hand, if you could calculate the statistics used to draw the two box-plots and then calculate intermediary statistics instead, this discrepancy in the underlying data would not pose any problem. Indeed, the latter approach is what is done in <code>gganimate</code>: all data expansion is performed after statistics have been calculated. In fact, all expansion is done when data has reached state 5. Why wait so long? A simple example to explain this is the case of colour (or fill) aesthetics. If they are mapped to a categorical variable there will be no way to create a smooth transition based on the raw data. On the other hand, if we wait until the raw data has been mapped to its final colour value, we may smoothly transition the colour itself, ignoring the fact that the intermediary colours do not correspond to any meaningful category in the raw data.
</p>
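<p>
To make the box-plot argument concrete, here is a minimal base R sketch. It is not how <code>gganimate</code> does it internally, and the data is simulated purely for illustration. It shows that the five summary statistics can be interpolated directly even though the two groups differ in size:
</p>
<pre class="r"><code>set.seed(1)
obs_a &lt;- rnorm(10, mean = 0) # 10 raw observations
obs_b &lt;- rnorm(15, mean = 3) # 15 raw observations

# The five numbers a box-plot draws: lower whisker, lower hinge,
# median, upper hinge, and upper whisker
stats_a &lt;- boxplot.stats(obs_a)$stats
stats_b &lt;- boxplot.stats(obs_b)$stats

# Linear interpolation between the two sets of statistics, one row per frame
positions &lt;- seq(0, 1, length.out = 5)
frames &lt;- t(sapply(positions, function(p) (1 - p) * stats_a + p * stats_b))
colnames(frames) &lt;- c('ymin', 'lower', 'middle', 'upper', 'ymax')

frames</code></pre>
<p>
Both states always have exactly five values, so there is nothing to add or remove along the way.
</p>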
</section>
<section id="the-curious-case-of-tesselation" class="level2">
<h2 class="anchored" data-anchor-id="the-curious-case-of-tesselation">
The Curious Case of Tessellation
</h2>
<p>
So, “what is the problem?”, you may ask. Indeed, this approach is almost universally good, to the extent that you might just ignore the existence of other approaches… But the devil is in the detail, so let’s make a plot:
</p>
<pre class="r"><code>library(ggplot2)
library(ggforce)

data &lt;- data.frame(
  x = runif(20),
  y = runif(20),
  state = rep(c('a', 'b'), 10)
)

ggplot(data, aes(x = x, y = y)) + 
  geom_voronoi_tile(fill = 'grey', colour = 'black', bound = c(0, 1, 0, 1)) + 
  geom_point() + 
  facet_wrap(~state)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<p>
Now, think about what you would expect a transition between the two panels to look like - my guess is that it is nothing like below:
</p>
<pre class="r"><code>library(gganimate)
ggplot(data, aes(x = x, y = y)) + 
  geom_voronoi_tile(fill = 'grey', colour = 'black', bound = c(0, 1, 0, 1)) + 
  geom_point() + 
  transition_states(state, transition_length = 3, state_length = 1) + 
  ease_aes('cubic-in-out')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-3-1.gif"><!-- -->
</p>
<p>
Okay, what is going on? To be honest I had a different expectation about how this would fail when I started writing this. The reason why the voronoi tiles are static (and calculated based on all the points) is that the voronoi tessellation is calculated on the full panel data. At the time the voronoi tile statistic receives the data it all just belongs to the same panel, since <code>gganimate</code> differentiates states using the group aesthetic. To show you how I expected this example to break down we’ll have to tell the voronoi stat to tessellate based on the groups instead:
</p>
<pre class="r"><code>ggplot(data, aes(x = x, y = y)) + 
  geom_voronoi_tile(fill = 'grey', colour = 'black', bound = c(0, 1, 0, 1),
                    by.group = TRUE) + 
  geom_point() + 
  transition_states(state, transition_length = 3, state_length = 1) + 
  ease_aes('cubic-in-out')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-4-1.gif"><!-- -->
</p>
<p>
Now, at least it is wrong in the way that I expected it to be. Why is this wrong? The tessellation stat outputs polygon data that is then drawn by a polygon geom, so <code>gganimate</code> does the best it can to transition these polygons smoothly between the states. In this example that is not what we want, though. We expect a tessellation to always be true, even during the transition, so the tessellation should be calculated for each frame, based on intermediary point positions. In other words, here we want the expansion to happen on the raw data.
</p>
<pre class="r"><code>library(tweenr)
library(magrittr)
data &lt;- split(data, data$state)

data &lt;- tween_state(data[[1]], data[[2]], 'cubic-in-out', 40) %&gt;% 
  keep_state(10) %&gt;% 
  tween_state(data[[1]],'cubic-in-out', 40) %&gt;% 
  keep_state(10)

ggplot(data, aes(x = x, y = y)) + 
  geom_voronoi_tile(fill = 'grey', colour = 'black', bound = c(0, 1, 0, 1),
                    by.group = TRUE) + 
  geom_point() + 
  transition_manual(.frame)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-5-1.gif"><!-- -->
</p>
<p>
Ah, we have finally arrived at the expected animation, but what a mess of a journey.
</p>
</section>
<section id="who-plots-tesselation-anyway" class="level2">
<h2 class="anchored" data-anchor-id="who-plots-tesselation-anyway">
Who Plots Tessellation Anyway?
</h2>
<p>
You may think the above example is laughably contrived, and this may even be the first time you’ve heard of voronoi tessellation. Hold my beer, because it is about to get even worse, even using a geom from <code>ggplot2</code> itself. We’ll start with a plot again:
</p>
<pre class="r"><code>data &lt;- data.frame(
  x = c(rnorm(50, mean = 5, sd = 3), rnorm(40, mean = 2, sd = 1)),
  y = c(rnorm(50, mean = -2, sd = 7), rnorm(40, mean = 6, sd = 4)),
  state = rep(c('a', 'b'), c(50, 40))
)

ggplot(data, aes(x = x, y = y)) +
  geom_contour(stat = 'density_2d') + 
  facet_wrap(~state)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<p>
And how might this look if we transition between <em>a</em> and <em>b</em>?
</p>
<pre class="r"><code>ggplot(data, aes(x = x, y = y)) +
  geom_contour(stat = 'density_2d') + 
  transition_states(state, transition_length = 3, state_length = 1) + 
  ease_aes('cubic-in-out')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-7-1.gif"><!-- -->
</p>
<p>
Oh my… The problem is more or less the same as with the tessellation - the stat creates a <em>primitive</em> data representation (here paths and not polygons) and <code>gganimate</code> does its best at transitioning those, but in doing this the intermediary frames do not resemble contour lines at all, but rather a bowl of spaghetti.
</p>
<p>
So, could we fix it in the same way and just prepare the data beforehand? Well, not really, as we run into the first problem discussed, way up at the beginning of this post. There is really no meaningful way of transitioning 50 points into 40. We could remove 10 and move the remaining 40, but in terms of the derived density this would look messy (but let’s try anyway):
</p>
<pre class="r"><code>data2 &lt;- split(data, data$state)
data2 &lt;- tween_state(data2[[1]], data2[[2]], 'cubic-in-out', 40) %&gt;% 
  keep_state(10) %&gt;% 
  tween_state(data2[[1]], 'cubic-in-out', 40) %&gt;% 
  keep_state(10)

ggplot(data2, aes(x = x, y = y)) +
  geom_contour(stat = 'density_2d') + 
  transition_manual(.frame)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-8-1.gif"><!-- -->
</p>
<p>
It sort of does the right thing, but there is a noticeable switch in the density as the 10 points disappear and reappear.
</p>
<p>
What we really want to do is to calculate intermediary states of the 2D densities that the contours are derived from. The densities remove the point discrepancy while presenting a statistic that can be truthfully transitioned. Unfortunately the density data is only present ephemerally inside the stat function and is not accessible to the outside world (where <code>gganimate</code> resides). We could rewrite the density_2d stat to wait with the contour transformation:
</p>
<pre class="r"><code>StatDensityContour &lt;- ggproto('StatDensityContour', StatDensity2d,
  compute_group = function (data, scales, na.rm = FALSE, h = NULL, contour = TRUE, 
                            n = 100, bins = NULL, binwidth = NULL) {
    StatDensity2d$compute_group(data, scales, na.rm = na.rm, h = h, contour = FALSE, 
                                n = n, bins = bins, binwidth = binwidth)
  },
  finish_layer = function(self, data, params) {
    names(data)[names(data) == 'density'] &lt;- 'z'
    do.call(rbind, lapply(split(data, data$PANEL), function(d) {
      StatContour$compute_panel(d, scales = NULL, bins = params$bins, 
                                binwidth = params$binwidth)
    }))
  }
)

ggplot(data, aes(x = x, y = y)) +
  geom_contour(stat = 'density_contour') + 
  transition_states(state, transition_length = 3, state_length = 1) + 
  ease_aes('cubic-in-out')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-09-22-what-are-we-plotting-what-are-we-animating_files/figure-html/unnamed-chunk-9-1.gif"><!-- -->
</p>
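<p>
The idea behind this stat can also be sketched outside of <code>ggplot2</code>. The following is only an illustration under my own assumptions (simulated data, <code>MASS::kde2d()</code> for the density estimate, and a plain linear blend), not the code <code>gganimate</code> runs. Both densities are estimated on a shared grid, the surfaces are blended, and contours are derived from the blend:
</p>
<pre class="r"><code>library(MASS) # for kde2d()

set.seed(42)
state_a &lt;- data.frame(x = rnorm(50, mean = 5, sd = 3), y = rnorm(50, mean = -2, sd = 7))
state_b &lt;- data.frame(x = rnorm(40, mean = 2, sd = 1), y = rnorm(40, mean = 6, sd = 4))

# Estimate both densities on one shared grid so they can be blended cell by cell
lims &lt;- c(range(c(state_a$x, state_b$x)), range(c(state_a$y, state_b$y)))
dens_a &lt;- kde2d(state_a$x, state_a$y, n = 50, lims = lims)
dens_b &lt;- kde2d(state_b$x, state_b$y, n = 50, lims = lims)

# Blend the density surfaces (not the points), then derive contours from the blend
position &lt;- 0.5
dens_mid &lt;- (1 - position) * dens_a$z + position * dens_b$z
contours_mid &lt;- contourLines(dens_a$x, dens_a$y, dens_mid)

length(contours_mid)</code></pre>
<p>
The 50 versus 40 points never have to be matched up, as only the two equally sized grids are interpolated.
</p>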
</section>
<section id="what-to-make-of-this" class="level2">
<h2 class="anchored" data-anchor-id="what-to-make-of-this">
What to make of this?
</h2>
<p>
You might feel like Alice who has stepped through the looking glass at this point. Should you always second guess whatever <code>gganimate</code> is doing? Of course not. The choice of interpolating the statistically transformed data is sound and will <em>just work</em> for most of what you want to do. I certainly want to allow gganimate to expand based on the raw data as well, though this has proven harder than expected as it is often only a subset of aesthetics you want to expand at that state (remember the problem with unmapped colour/fill).
</p>
<p>
Even if <em>early expansion</em> gets implemented it will only solve problems such as the voronoi example. The last contour example runs deeper and touches upon the theory of the grammar of graphics and how <code>ggplot2</code> implements it. Statistical transformations are often envisioned as a single operation, but can just as well be thought of as a chain of transformations (here density_2d -&gt; contour). Alternatively one could think that it was the responsibility of the geom to calculate the contour lines. All in all, the dichotomy of stat+geom is not as clear-cut as it might appear, which has not been much of a problem when generating static plots. With the advent of <code>gganimate</code> this problem becomes more pertinent and I honestly don’t know the best way to address it. In a perfect world, all stats would return the data-state best fitted for expansion, but this would require the <code>finish_layer()</code> hook to be more powerful, and would obviously require rewrites of a slew of geoms/stats. Then comes the question of whether it is even the responsibility of geom/stat developers to consider <code>gganimate</code> in the first place…
</p>
<p>
No matter the eventual solution to all this, I hope this post has made you a bit more aware of what happens to the data you plot as you pass it into <code>ggplot2</code>. Visualisations are after all first and foremost about data transformations…
</p>
</section>



 ]]></description>
  <category>animation</category>
  <category>gganimate</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2018-09-22-what-are-we-plotting-what-are-we-animating/</guid>
  <pubDate>Mon, 24 Sep 2018 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/gganimate_logo.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>Scico and the Colour Conundrum</title>
  <link>https://tutor-church-15580.netlify.app/posts/2018-05-30-scico-and-the-colour-conundrum/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/scico_logo_small.png" align="right" style="width:50%;max-width:200px;">
</p>
<p>
I’m happy to once again announce the release of a package. This time it happens to be a rather unplanned and quick new package, which is fun for a change. The package in question is <code>scico</code> which provides access to the <a href="http://www.fabiocrameri.ch/colourmaps.php">colour palettes developed by Fabio Crameri</a> as well as scale functions for <code>ggplot2</code> so they can be used there. As there is not a lot to talk about in such a simple package I’ll also spend some time discussing why choice of colour is important beyond aesthetic considerations, and discuss how the quality of a palette might be assessed.
</p>
<section id="an-overview-of-the-package" class="level2">
<h2 class="anchored" data-anchor-id="an-overview-of-the-package">
An overview of the package
</h2>
<p>
<code>scico</code> provides a total of 17 different continuous palettes, all of which are available with the <code>scico()</code> function. For anyone having used <code>viridis()</code> the <code>scico()</code> API is very familiar:
</p>
<pre class="r"><code>library(scico)
scico(15, palette = 'oslo')</code></pre>
<pre><code>##  [1] "#000000" "#09131E" "#0C2236" "#133352" "#19456F" "#24588E" "#3569AC"
##  [8] "#4D7CC6" "#668CCB" "#7C99CA" "#94A8C9" "#ABB6C7" "#C4C7CC" "#E1E1E1"
## [15] "#FFFFFF"</code></pre>
<p>
In order to get a quick overview of all the available palettes use the <code>scico_palette_show()</code> function:
</p>
<pre class="r"><code>scico_palette_show()</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-05-30-scico-and-the-colour-conundrum_files/figure-html/unnamed-chunk-2-1.png" width="672">
</p>
<p>
As can be seen, the collection consists of both sequential and diverging palettes, both of which have their own uses depending on the data you want to show. A special mention goes to the <code>oleron</code> palette which is intended for topographical height data in order to produce the well known <em>atlas look</em>. Be sure to center this palette around 0 or else you will end up with very misleading maps.
</p>
<p>
<code>ggplot2</code> support is provided with the <code>scale_[colour|color|fill]_scico()</code> functions, which work as expected:
</p>
<pre class="r"><code>library(ggplot2)
volcano &lt;- data.frame(
  x = rep(seq_len(ncol(volcano)), each = nrow(volcano)),
  y = rep(seq_len(nrow(volcano)), ncol(volcano)),
  Altitude = as.vector(volcano)
)

ggplot(volcano, aes(x = x, y = y, fill = Altitude)) + 
  geom_raster() + 
  theme_void() +
  scale_fill_scico(palette = 'turku') </code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-05-30-scico-and-the-colour-conundrum_files/figure-html/unnamed-chunk-3-1.png" width="672">
</p>
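<p>
Returning to the advice about centering <code>oleron</code> around 0, one way to do this is to pass symmetric limits to the scale. This is a sketch under my own assumptions: the 140 m “sea level” offset is made up purely to get negative values, and <code>limits</code> is the standard <code>ggplot2</code> scale argument, which I expect to be passed through by the scale function:
</p>
<pre class="r"><code>library(ggplot2)
library(scico)

# Made-up relief data: volcano heights shifted by an arbitrary 140 m "sea level"
heights &lt;- datasets::volcano
relief &lt;- data.frame(
  x = rep(seq_len(ncol(heights)), each = nrow(heights)),
  y = rep(seq_len(nrow(heights)), ncol(heights)),
  height = as.vector(heights) - 140
)

# Symmetric limits put 0 at the midpoint of the palette
max_abs &lt;- max(abs(relief$height))

p &lt;- ggplot(relief, aes(x = x, y = y, fill = height)) + 
  geom_raster() + 
  theme_void() +
  scale_fill_scico(palette = 'oleron', limits = c(-max_abs, max_abs))

p</code></pre>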
<p>
This is more or less all there is for this package… Now, let’s get to the meat of the discussion.
</p>
</section>
<section id="whats-in-a-colour" class="level2">
<h2 class="anchored" data-anchor-id="whats-in-a-colour">
What’s in a colour?
</h2>
<p>
If you’ve ever wondered why we are not all just using rainbow colours in our plots (after all, rainbows <em>are</em> pretty…) it’s because our choice of colour scale has a deep impact on what changes in the underlying data our eyes can perceive. The rainbow colour scale is still very common and notoriously bad - see e.g.&nbsp;<a href="https://ieeexplore.ieee.org/document/4118486/">Borland &amp; Taylor (2007)</a>, <a href="https://blogs.egu.eu/divisions/gd/2017/08/23/the-rainbow-colour-map/"><em>The Rainbow Colour Map (repeatedly) considered harmful</em></a>, and <a href="https://eagereyes.org/basics/rainbow-color-map"><em>How The Rainbow Color Map Misleads</em></a> - because it fails on two properties that are fundamental to designing good colour scales: <em>perceptual uniformity</em> and <em>colour blind safety</em>. Both of these issues have been taken into account when designing the <code>scico</code> palettes, but let’s tackle them one by one:
</p>
<section id="colour-blindness" class="level3">
<h3 class="anchored" data-anchor-id="colour-blindness">
Colour blindness
</h3>
<p>
Up to 10% of Northern European males have the most common type of colour blindness (deuteranomaly, also known as red-green colour blindness), while the number is lower for other population groups. In addition, other, rarer types of colour blindness exist as well. In any case, the chance that a person with a colour vision deficiency will look at your plots is pretty high.
</p>
<p>
As we have to assume that the plots we produce will be looked at by people with colour vision deficiency, we must make sure that the colours we use to encode data can be clearly read by them (ornamental colours are less important as they - hopefully - don’t impact the conclusion of the graphic). Thankfully there are ways to simulate how colours are perceived by people with various types of colour blindness. Let’s look at the rainbow colour map:
</p>
<pre class="r"><code>library(pals)

pal.safe(rainbow, main = 'Rainbow scale')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-05-30-scico-and-the-colour-conundrum_files/figure-html/unnamed-chunk-4-1.png" width="672">
</p>
<p>
As can be seen, there are huge areas of the scale where key tints disappear, making it impossible to correctly map colours back to their original data values. Put this in contrast to one of the <code>scico</code> palettes:
</p>
<pre class="r"><code>pal.safe(scico(100, palette = 'tokyo'), main = 'Tokyo scale')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-05-30-scico-and-the-colour-conundrum_files/figure-html/unnamed-chunk-5-1.png" width="672">
</p>
<p>
While colour blindness certainly has an effect here, it is less detrimental as changes along the scale can still be perceived and the same tint does not occur at multiple places.
</p>
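<p>
If you want the simulated colours themselves rather than a ready-made plot, they can be computed directly. This is a minimal sketch that assumes the <code>dichromat</code> package is installed (it is not otherwise used in this post) and only simulates the deutan type:
</p>
<pre class="r"><code>library(scico)
library(dichromat)
library(pals)

rainbow_cols &lt;- rainbow(100)
tokyo_cols &lt;- scico(100, palette = 'tokyo')

# Simulate how each palette appears with a deutan (red-green) deficiency
rainbow_deutan &lt;- dichromat(rainbow_cols, type = 'deutan')
tokyo_deutan &lt;- dichromat(tokyo_cols, type = 'deutan')

# Show original and simulated palettes as bands for comparison
pal.bands(
  rainbow_cols, rainbow_deutan, tokyo_cols, tokyo_deutan,
  labels = c('rainbow', 'rainbow (deutan)', 'tokyo', 'tokyo (deutan)')
)</code></pre>
<p>
Having the simulated colours as plain vectors makes it possible to reuse them, e.g.&nbsp;to preview one of your own plots with the simulated palette.
</p>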
</section>
<section id="perceptual-uniformity" class="level3">
<h3 class="anchored" data-anchor-id="perceptual-uniformity">
Perceptual uniformity
</h3>
<p>
While lack of colour blind safety “only” affects a subgroup of your audience, lack of perceptual uniformity affects everyone - even you. Behind the slightly highbrow name lies the criterion that equal jumps in the underlying data should result in equal jumps in perceived colour difference. Said in another way, every step along the palette should be perceived as giving the same amount of difference in colour.
</p>
<p>
One way to assess perceptual uniformity is by looking at small oscillations inside the scale. Let’s return to our favourite worst rainbow scale:
</p>
<pre class="r"><code>pal.sineramp(rainbow, main = 'Rainbow scale')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-05-30-scico-and-the-colour-conundrum_files/figure-html/unnamed-chunk-6-1.png" width="672">
</p>
<p>
We can see that there are huge differences in how clearly the oscillations appear along the scale, and around the green area they even disappear. In comparison, the <code>scico</code> palettes produce much more even results:
</p>
<pre class="r"><code>pal.sineramp(scico(100, palette = 'tokyo'), main = 'Tokyo scale')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-05-30-scico-and-the-colour-conundrum_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
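<p>
The same comparison can be approximated numerically. The sketch below measures the distance between neighbouring colours in CIELAB space using only base R; the helper names <code>lab_steps()</code> and <code>cv()</code> are made up for this example, and CIELAB distance is only a rough stand-in for perceived colour difference:
</p>
<pre class="r"><code>library(scico)

# Euclidean distance in CIELAB between neighbouring colours of a palette
lab_steps &lt;- function(cols) {
  rgb &lt;- t(col2rgb(cols)) / 255
  lab &lt;- convertColor(rgb, from = 'sRGB', to = 'Lab')
  sqrt(rowSums(diff(lab)^2))
}

rainbow_steps &lt;- lab_steps(rainbow(100))
tokyo_steps &lt;- lab_steps(scico(100, palette = 'tokyo'))

# Relative spread of the step sizes; lower means more even steps
cv &lt;- function(x) sd(x) / mean(x)
c(rainbow = cv(rainbow_steps), tokyo = cv(tokyo_steps))</code></pre>
<p>
A perceptually uniform palette should give step sizes that are close to constant, so one would expect the value for <code>tokyo</code> to be clearly lower than the one for the rainbow scale.
</p>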
</section>
<section id="but-wait---theres-more" class="level3">
<h3 class="anchored" data-anchor-id="but-wait---theres-more">
But wait - there’s more!
</h3>
<p>
This is just a very short overview of the world of colour perception and how it affects information visualisation. The <code>pals</code> package contains more functions to assess the quality of colour palettes, some of which have been collected in an ensemble function:
</p>
<pre class="r"><code>pal.test(scico(100, palette = 'broc'), main = 'Broc scale')</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-05-30-scico-and-the-colour-conundrum_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
<p>
It also has <a href="https://cran.r-project.org/web/packages/pals/vignettes/pals_examples.html">a vignette</a> that explains in more detail how the different plots can be used to look into different aspects of the palette.
</p>
<p>
<code>scico</code> is also not the only package that provides well-designed, safe, colour palettes. <a href="https://cran.r-project.org/web/packages/RColorBrewer/index.html"><code>RColorBrewer</code></a> has been a beloved utility for a long time, as well as the more recent <a href="https://cran.r-project.org/web/packages/viridis/index.html"><code>viridis</code></a>. Still, choice is good and using the same palettes for a prolonged time can make them seem old and boring, so the more the merrier.
</p>
<p>
A last honourable mention is <a href="https://github.com/EmilHvitfeldt/r-color-palettes/blob/master/README.md">the overview of palettes in R</a> that <a href="https://twitter.com/Emil_Hvitfeldt">Emil Hvitfeldt</a> has put together. Many of the palettes in it (the lion’s share, actually) have not been designed with the issues discussed above in mind, but sometimes that’s OK - at least you now know how to assess the impact of your choice and weigh it against the other considerations you have.
</p>
<p>
<em>Always be wary of colours</em>
</p>
</section>
</section>



 ]]></description>
  <category>announcement</category>
  <category>visualization</category>
  <category>colour</category>
  <category>color</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2018-05-30-scico-and-the-colour-conundrum/</guid>
  <pubDate>Wed, 30 May 2018 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/scico_announce.png" medium="image" type="image/png" height="72" width="144"/>
</item>
<item>
  <title>lime v0.4: The kitten picture edition</title>
  <link>https://tutor-church-15580.netlify.app/posts/2018-03-06-lime-v0-4-the-kitten-picture-edition/</link>
  <description><![CDATA[ 




<p>
<img src="https://tutor-church-15580.netlify.app/assets/img/lime_logo_small.jpg" align="right" style="width:50%;max-width:200px">
</p>
<p>
I’m happy to report a new major release of <code>lime</code> has landed on CRAN. <code>lime</code> is an R port of the Python library of the same name by Marco Ribeiro that allows the user to pry open black box machine learning models and explain their outcomes on a per-observation basis. It works by modelling the outcome of the black box in the local neighborhood around the observation to explain and using this local model to explain why (not how) the black box did what it did. For more information about the theory of <code>lime</code> I will direct you to the article <a href="https://arxiv.org/abs/1602.04938">introducing the methodology</a>.
</p>
<section id="new-features" class="level2">
<h2 class="anchored" data-anchor-id="new-features">
New features
</h2>
<p>
The meat of this release centers around two new features that are somewhat linked: Native support for keras models and support for explaining image models.
</p>
<section id="keras-and-images" class="level3">
<h3 class="anchored" data-anchor-id="keras-and-images">
keras and images
</h3>
<p>
J.J. Allaire was kind enough to namedrop <code>lime</code> during his keynote introduction of the <code>tensorflow</code> and <code>keras</code> packages and I felt compelled to support them natively. As keras is by far the most popular way to interface with tensorflow it is first in line for built-in support. The addition of keras means that <code>lime</code> now directly supports models from the following packages:
</p>
<ul>
<li>
<a href="https://github.com/topepo/caret">caret</a>
</li>
<li>
<a href="https://github.com/mlr-org/mlr">mlr</a>
</li>
<li>
<a href="https://github.com/dmlc/xgboost">xgboost</a>
</li>
<li>
<a href="https://github.com/h2oai/h2o-3">h2o</a>
</li>
<li>
<a href="https://github.com/rstudio/keras">keras</a>
</li>
</ul>
<p>
If you’re working on something too obscure or cutting edge to be able to use these packages, it is still possible to make your model <code>lime</code> compliant by providing <code>predict_model()</code> and <code>model_type()</code> methods for it.
</p>
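<p>
As a rough illustration of what such methods could look like, here is a minimal sketch for a made-up model class. The <code>my_model</code> class, the wrapped logistic regression, and the class labels are all assumptions made for this example; the sketch relies on <code>predict_model()</code> returning a data frame with one column of probabilities per class for classification models:
</p>
<pre class="r"><code>library(lime)

# Hypothetical model class: a logistic regression wrapped in a custom class
fit &lt;- glm(am ~ wt + hp, data = mtcars, family = binomial)
my_model &lt;- structure(list(fit = fit), class = 'my_model')

# Tell lime which kind of model it is dealing with
model_type.my_model &lt;- function(x, ...) 'classification'

# Return one column of probabilities per class
predict_model.my_model &lt;- function(x, newdata, type, ...) {
  p &lt;- predict(x$fit, newdata = newdata, type = 'response')
  data.frame(automatic = 1 - p, manual = p)
}

# The custom model can now be passed to lime() like any supported model
explainer &lt;- lime(mtcars[, c('wt', 'hp')], my_model)</code></pre>
<p>
For a regression model, <code>model_type()</code> would instead return <code>'regression'</code> and <code>predict_model()</code> a data frame with a single <code>Response</code> column.
</p>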
<p>
keras models are used just like any other model, by passing them into the <code>lime()</code> function along with the training data in order to create an explainer object. Because we’re soon going to talk about image models, we’ll be using one of the pre-trained ImageNet models that are available from keras itself:
</p>
<pre class="r"><code>library(keras)
library(lime)
library(magick)

model &lt;- application_vgg16(
  weights = "imagenet",
  include_top = TRUE
)
model</code></pre>
<pre><code>## Model
## ___________________________________________________________________________
## Layer (type)                     Output Shape                  Param #     
## ===========================================================================
## input_1 (InputLayer)             (None, 224, 224, 3)           0           
## ___________________________________________________________________________
## block1_conv1 (Conv2D)            (None, 224, 224, 64)          1792        
## ___________________________________________________________________________
## block1_conv2 (Conv2D)            (None, 224, 224, 64)          36928       
## ___________________________________________________________________________
## block1_pool (MaxPooling2D)       (None, 112, 112, 64)          0           
## ___________________________________________________________________________
## block2_conv1 (Conv2D)            (None, 112, 112, 128)         73856       
## ___________________________________________________________________________
## block2_conv2 (Conv2D)            (None, 112, 112, 128)         147584      
## ___________________________________________________________________________
## block2_pool (MaxPooling2D)       (None, 56, 56, 128)           0           
## ___________________________________________________________________________
## block3_conv1 (Conv2D)            (None, 56, 56, 256)           295168      
## ___________________________________________________________________________
## block3_conv2 (Conv2D)            (None, 56, 56, 256)           590080      
## ___________________________________________________________________________
## block3_conv3 (Conv2D)            (None, 56, 56, 256)           590080      
## ___________________________________________________________________________
## block3_pool (MaxPooling2D)       (None, 28, 28, 256)           0           
## ___________________________________________________________________________
## block4_conv1 (Conv2D)            (None, 28, 28, 512)           1180160     
## ___________________________________________________________________________
## block4_conv2 (Conv2D)            (None, 28, 28, 512)           2359808     
## ___________________________________________________________________________
## block4_conv3 (Conv2D)            (None, 28, 28, 512)           2359808     
## ___________________________________________________________________________
## block4_pool (MaxPooling2D)       (None, 14, 14, 512)           0           
## ___________________________________________________________________________
## block5_conv1 (Conv2D)            (None, 14, 14, 512)           2359808     
## ___________________________________________________________________________
## block5_conv2 (Conv2D)            (None, 14, 14, 512)           2359808     
## ___________________________________________________________________________
## block5_conv3 (Conv2D)            (None, 14, 14, 512)           2359808     
## ___________________________________________________________________________
## block5_pool (MaxPooling2D)       (None, 7, 7, 512)             0           
## ___________________________________________________________________________
## flatten (Flatten)                (None, 25088)                 0           
## ___________________________________________________________________________
## fc1 (Dense)                      (None, 4096)                  102764544   
## ___________________________________________________________________________
## fc2 (Dense)                      (None, 4096)                  16781312    
## ___________________________________________________________________________
## predictions (Dense)              (None, 1000)                  4097000     
## ===========================================================================
## Total params: 138,357,544
## Trainable params: 138,357,544
## Non-trainable params: 0
## ___________________________________________________________________________</code></pre>
<p>
The vgg16 model is an image classification model that has been built as part of the ImageNet competition, where the goal is to classify pictures into 1000 categories with the highest accuracy. As we can see it is fairly complicated.
</p>
<p>
In order to create an explainer we will need to pass in the training data as well. For image data the training data is really only used to tell lime that we are dealing with an image model, so any image will suffice. The format for the training data is simply the path to the images, and because the internet runs on kitten pictures we’ll use one of these:
</p>
<pre class="r"><code>img &lt;- image_read('https://www.data-imaginist.com/assets/img/kitten.jpg')
img_path &lt;- file.path(tempdir(), 'kitten.jpg')
image_write(img, img_path)
plot(as.raster(img))</code></pre>
<div class="figure">
<p><img src="https://tutor-church-15580.netlify.app/post/2018-03-06-lime-v0-4-the-kitten-picture-edition_files/figure-html/unnamed-chunk-2-1.png" alt="Photo by Paul on Unsplash" width="672"></p>
<p class="caption">
</p><p>Figure 1: Photo by Paul on Unsplash</p>
<p></p>
</div>
<p>
As with text models the explainer will need to know how to prepare the input data for the model. For keras models this means formatting the image data as tensors. Thankfully keras comes with a lot of tools for reshaping image data:
</p>
<pre class="r"><code>image_prep &lt;- function(x) {
  arrays &lt;- lapply(x, function(path) {
    img &lt;- image_load(path, target_size = c(224,224))
    x &lt;- image_to_array(img)
    x &lt;- array_reshape(x, c(1, dim(x)))
    x &lt;- imagenet_preprocess_input(x)
  })
  do.call(abind::abind, c(arrays, list(along = 1)))
}
explainer &lt;- lime(img_path, model, image_prep)</code></pre>
<p>
We now have an explainer model for understanding how the vgg16 neural network makes its predictions. Before we go along, let’s see what the model thinks of our kitten:
</p>
<pre class="r"><code>res &lt;- predict(model, image_prep(img_path))
imagenet_decode_predictions(res)</code></pre>
<pre><code>## [[1]]
##   class_name class_description      score
## 1  n02124075      Egyptian_cat 0.48913878
## 2  n02123045             tabby 0.15177219
## 3  n02123159         tiger_cat 0.10270492
## 4  n02127052              lynx 0.02638111
## 5  n03793489             mouse 0.00852214</code></pre>
<p>
So, it is pretty sure about the whole cat thing. The reason we need to use <code>imagenet_decode_predictions()</code> is that the output of a keras model is always just a nameless tensor:
</p>
<pre class="r"><code>dim(res)</code></pre>
<pre><code>## [1]    1 1000</code></pre>
<pre class="r"><code>dimnames(res)</code></pre>
<pre><code>## NULL</code></pre>
<p>
We are used to classifiers knowing the class labels, but this is not the case for keras. Motivated by this, <code>lime</code> now has a way to define/overwrite the class labels of a model, using the <code>as_classifier()</code> function. Let’s redo our explainer:
</p>
<pre class="r"><code>model_labels &lt;- readRDS(system.file('extdata', 'imagenet_labels.rds', package = 'lime'))
explainer &lt;- lime(img_path, as_classifier(model, model_labels), image_prep)</code></pre>
<blockquote class="blockquote">
<p>
There is also an <code>as_regressor()</code> function which tells <code>lime</code>, without a doubt, that the model is a regression model. Most models can be introspected to see which type of model they are, but neural networks don’t really care. <code>lime</code> guesses the model type from the activation used in the last layer (linear activation == regression), but if that heuristic fails then <code>as_regressor()</code>/<code>as_classifier()</code> can be used.
</p>
</blockquote>
<p>
We are now ready to poke into the model and find out what makes it think our image is of an Egyptian cat. But… first I’ll have to talk about yet another concept: superpixels (I promise I’ll get to the explanation part in a bit).
</p>
<p>
In order to create meaningful permutations of our image (remember, this is the central idea in <code>lime</code>), we have to define how to do so. The permutations need to be substantial enough to have an impact on the image, but not so much that the model completely fails to recognise the content in every case - further, they should lead to an interpretable result. The concept of superpixels lends itself well to these constraints. In short, a superpixel is a patch of an area with high homogeneity, and superpixel segmentation is a clustering of image pixels into a number of superpixels. By segmenting the image to explain into superpixels we can turn areas of contextual similarity on and off during the permutations and find out if each area is important. It is still necessary to experiment a bit as the optimal number of superpixels depends on the content of the image. Remember, we need them to be large enough to have an impact but not so large that the class probability becomes effectively binary. <code>lime</code> comes with a function to assess the superpixel segmentation before beginning the explanation and it is recommended to play with it a bit - with time you’ll likely get a feel for the right values:
</p>
<pre class="r"><code># default
plot_superpixels(img_path)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-03-06-lime-v0-4-the-kitten-picture-edition_files/figure-html/unnamed-chunk-7-1.png" width="672">
</p>
<pre class="r"><code># Changing some settings
plot_superpixels(img_path, n_superpixels = 200, weight = 40)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-03-06-lime-v0-4-the-kitten-picture-edition_files/figure-html/unnamed-chunk-7-2.png" width="672">
</p>
<p>
The default is set to a pretty low number of superpixels — if the subject of interest is relatively small it may be necessary to increase the number of superpixels so that the full subject does not end up in one, or a few superpixels. The <code>weight</code> parameter will allow you to make the segments more compact by weighting spatial distance higher than colour distance. For this example we’ll stick with the defaults.
</p>
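<p>
To make the idea of turning superpixels on and off concrete, here is a toy sketch in base R. It is not how <code>lime</code> segments or permutes images internally; the 8x8 matrix standing in for an image and the four square segments are invented for illustration:
</p>
<pre class="r"><code>set.seed(42)

# Toy 8x8 greyscale "image" and a made-up segmentation into 4 square superpixels
img &lt;- matrix(runif(64), nrow = 8)
segments &lt;- outer(rep(1:2, each = 4), rep(c(0, 2), each = 4), `+`)

# One permutation: randomly keep or drop each superpixel
keep &lt;- sample(c(TRUE, FALSE), 4, replace = TRUE)

# Dropped superpixels are replaced by a flat fill, kept ones are untouched
permuted &lt;- img
permuted[!keep[segments]] &lt;- mean(img)</code></pre>
<p>
Repeating this many times gives a set of on/off vectors (<code>keep</code>) together with the model prediction for each permuted image, which is the kind of data a simple local model can be fitted to in order to find the superpixels that matter.
</p>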
<p>
Be aware that explaining image models is much heavier than explaining tabular or text models. In effect it will create 1000 new images per explanation (the default permutation size for images) and run these through the model. As image classification models are often quite heavy, this will result in computation time measured in minutes. The permutations are batched (defaulting to 10 permutations per batch), so you should not be afraid of running out of RAM or hard-drive space.
</p>
<pre class="r"><code>explanation &lt;- explain(img_path, explainer, n_labels = 2, n_features = 20)</code></pre>
<p>
The output of an image explanation is a data frame of the same format as that from tabular and text data. Each feature will be a superpixel and the pixel range of the superpixel will be used as its description. Usually the explanation will only make sense in the context of the image itself, so the new version of <code>lime</code> also comes with a <code>plot_image_explanation()</code> function to do just that. Let’s see what our explanation has to tell us:
</p>
<pre class="r"><code>plot_image_explanation(explanation)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-03-06-lime-v0-4-the-kitten-picture-edition_files/figure-html/unnamed-chunk-8-1.png" width="672">
</p>
<p>
We can see that the model, for both the major predicted classes, focuses on the cat, which is nice since they are both different cat breeds. The plot function has a few different arguments to help you tweak the visual, and it filters low scoring superpixels away by default. An alternative view that puts more focus on the relevant superpixels, but removes the context, can be seen by using <code>display = 'block'</code>:
</p>
<pre class="r"><code>plot_image_explanation(explanation, display = 'block', threshold = 0.01)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-03-06-lime-v0-4-the-kitten-picture-edition_files/figure-html/unnamed-chunk-9-1.png" width="672">
</p>
<p>
While not as common with image explanations, it is also possible to look at the areas of an image that contradict the class:
</p>
<pre class="r"><code>plot_image_explanation(explanation, threshold = 0, show_negative = TRUE, fill_alpha = 0.6)</code></pre>
<p>
<img src="https://tutor-church-15580.netlify.app/post/2018-03-06-lime-v0-4-the-kitten-picture-edition_files/figure-html/unnamed-chunk-10-1.png" width="672">
</p>
<p>
As each explanation takes longer to create and needs to be tweaked on a per-image basis, image explanations are not something that you’ll create in large batches as you might do with tabular and text data. Still, a few explanations might allow you to understand your model better and be used for communicating the workings of your model. Further, as the time-limiting factor in image explanations is the image classifier and not lime itself, it is bound to improve as image classifiers become more performant.
</p>
</section>
<section id="grab-back" class="level3">
<h3 class="anchored" data-anchor-id="grab-back">
Grab bag
</h3>
<p>
Apart from keras and image support, a slew of other features and improvements have been added. Here’s a quick overview:
</p>
<ul>
<li>
All explanation plots now include the fit of the ridge regression used to make the explanation. This makes it easy to assess how well the assumption of local linearity holds.
</li>
<li>
When explaining tabular data the default distance measure is now <code>'gower'</code> from the <code>gower</code> package. <code>gower</code> makes it possible to measure distances between heterogeneous data without converting all features to numeric and experimenting with different exponential kernels.
</li>
<li>
When explaining tabular data numerical features will no longer be sampled from a normal distribution during permutations, but from a kernel density defined by the training data. This should ensure that the permutations are more representative of the expected input.
</li>
</ul>
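<p>
The last two points can be illustrated outside of <code>lime</code>. The sketch below calls <code>gower_dist()</code> from the <code>gower</code> package on a mixed-type data frame and then draws values from a kernel density estimate in base R. The density sampling is a generic approach (resample the data and add Gaussian noise with the kernel bandwidth) and is only assumed to resemble what <code>lime</code> does internally:
</p>
<pre class="r"><code>library(gower)

# Gower distance handles numeric and categorical columns together:
# distance from the first iris observation to every observation
d &lt;- gower_dist(iris[1, ], iris)
head(d)

# Sampling a numeric feature from a kernel density estimate:
# resample observed values and add noise scaled by the kernel bandwidth
set.seed(1)
x &lt;- iris$Sepal.Length
bw &lt;- bw.nrd0(x)
kde_sample &lt;- sample(x, 1000, replace = TRUE) + rnorm(1000, sd = bw)</code></pre>
<p>
Compared to sampling from a single normal distribution, the density-based sample follows skewed or multimodal features much more closely.
</p>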
</section>
</section>
<section id="wrapping-up" class="level2">
<h2 class="anchored" data-anchor-id="wrapping-up">
Wrapping up
</h2>
<p>
This release represents an important milestone for <code>lime</code> in R. With the addition of image explanations the <code>lime</code> package is now on par with or above its Python relative, feature-wise. Further development will focus on improving the performance of the model, e.g.&nbsp;by adding parallelisation or improving the local model definition, as well as exploring alternative explanation types such as <a href="https://homes.cs.washington.edu/%7Emarcotcr/aaai18.pdf">anchor</a>.
</p>
<p>
Happy Explaining!
</p>
</section>



 ]]></description>
  <category>announcement</category>
  <category>lime</category>
  <category>machine learning</category>
  <category>prediction</category>
  <category>modelling</category>
  <guid>https://tutor-church-15580.netlify.app/posts/2018-03-06-lime-v0-4-the-kitten-picture-edition/</guid>
  <pubDate>Tue, 06 Mar 2018 00:00:00 GMT</pubDate>
  <media:content url="https://tutor-church-15580.netlify.app/assets/img/kitten_lime.png" medium="image" type="image/png" height="96" width="144"/>
</item>
</channel>
</rss>
