<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/assets/rss.xsl"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
    <channel>
        <title>Max Bernstein&apos;s Blog</title>
        <description></description>
        <link>https://bernsteinbear.com</link>
        <atom:link href="https://bernsteinbear.com/feed.xml" rel="self" type="application/rss+xml" />
        <item shouldShow="false">
            <title>Sorry for marking all the posts as unread</title>
            <description>
              I noticed that the URLs were all a little off (had two slashes
              instead of one) and went in and fixed it. I did not think
              everyone's RSS software was going to freak out the way it did.

              PS: this is a special RSS-only post that is not visible on the
              site. Enjoy.
            </description>
            <pubDate>Wed, 31 Jan 2024 00:00:00 +0000</pubDate>
            <guid isPermaLink="false">rss-only-post-1</guid>
        </item>
        
        <item>
            <title>Seeing like a JIT</title>
            <description>&lt;p&gt;I work on optimizing dynamic language runtimes at scale.&lt;/p&gt;

&lt;p&gt;A lot of people ask me about my job. Or at least, a lot of people make
statements to me that indicate that they fundamentally misunderstand my job.
Well, I don’t know about a lot—it’s happened a couple of times, anyway.&lt;/p&gt;

&lt;p&gt;But a large part of my job&lt;sup id=&quot;fnref:real-job&quot;&gt;&lt;a href=&quot;#fn:real-job&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; is not fundamentally that complex. It’s
complicated, sure—many moving parts, fiddly interlocking pieces of code,
coordinating people across timezones, languages, and countries—but the goal
is to make Ruby (or insert your choice of dynamic language here) faster, and to
do so economically.&lt;/p&gt;

&lt;p&gt;The first step in understanding this is learning about &lt;em&gt;the fast paths&lt;/em&gt; and
&lt;em&gt;the tradeoffs&lt;/em&gt;. That’s what we’ll talk about in this post.&lt;/p&gt;

&lt;p&gt;TODO: also cover the separation implied by “if I see that an object has a type
or some property, it has been guaranteed somewhere else and I just need to
assume that has been taken care of.”&lt;/p&gt;

&lt;h2 id=&quot;the-fast-paths&quot;&gt;The fast paths&lt;/h2&gt;

&lt;p&gt;We’ll start with the view from the interpreter.&lt;/p&gt;

&lt;h3 id=&quot;data-structures&quot;&gt;Data structures&lt;/h3&gt;

&lt;p&gt;People do a lot of operations on integers.&lt;/p&gt;

&lt;p&gt;Most integers that show up in practice are small (citation needed).&lt;/p&gt;

&lt;p&gt;So runtimes tend to split integers into fixnums and bignums: fixnums are
immediate values that fit in a tagged machine word, and bignums are
heap-allocated.&lt;/p&gt;

&lt;p&gt;Operations on fixnums, though they have to deal with the tag, are very fast.&lt;/p&gt;
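
&lt;p&gt;To make the fixnum side concrete, here is a minimal sketch of one possible
tagging scheme. It assumes that the low bit of a word is 1 for fixnums and 0
for heap pointers, and that the C compiler is GCC or Clang (for
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__builtin_add_overflow&lt;/code&gt;). The helper names and the checked
signature are made up for illustration; a real runtime will differ in the
details.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &amp;lt;stdbool.h&amp;gt;
#include &amp;lt;stdint.h&amp;gt;

typedef intptr_t VALUE;

// Assumed encoding: low bit 1 means fixnum, low bit 0 means heap pointer.
static bool is_fixnum(VALUE v) { return (v &amp;amp; 1) == 1; }

// Hypothetical helpers to box and unbox a small integer.
static VALUE fixnum_from_int(intptr_t n) {
    return (VALUE)(((uintptr_t)n &amp;lt;&amp;lt; 1) | 1);
}
static intptr_t fixnum_to_int(VALUE v) {
    return v &amp;gt;&amp;gt; 1;  // relies on arithmetic right shift, as on GCC and Clang
}

// Add two tagged fixnums without untagging: (2a+1) + (2b+1) - 1 == 2(a+b)+1.
// Returns false on overflow, where a real VM would fall back to a bignum.
static bool fixnum_add_checked(VALUE left, VALUE right, VALUE *out) {
    return !__builtin_add_overflow(left, right - 1, out);
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In this sketch the tag check is a mask and a compare, and the addition is one
subtract plus one overflow-checked add, with no memory access at all.&lt;/p&gt;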

&lt;h3 id=&quot;static-properties&quot;&gt;Static properties&lt;/h3&gt;

&lt;p&gt;Most programmers don’t use most programming language features.&lt;/p&gt;

&lt;p&gt;Look at the following Ruby code snippet:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;As I and others have &lt;a href=&quot;/blog/typed-python/&quot;&gt;made a habit of saying&lt;/a&gt;, in
languages such as Ruby and Python, this kind of code could do &lt;em&gt;anything&lt;/em&gt;. The
surface syntax of addition is compiled to a virtual method call under the hood.
Inside the interpreter, the opcode handler might look something like this:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;receiver&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lookup_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;receiver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We, as VM implementors, have additional context: most people use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;+&lt;/code&gt; to mean
integer addition or string concatenation&lt;sup id=&quot;fnref:measure&quot;&gt;&lt;a href=&quot;#fn:measure&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. So if we inline some very
fast checks into the method send handler, we can execute integer addition and
string concatenation much faster.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Symbol_PLUS&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nobody_has_overridden_integer_plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_pop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_pop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fixnum_add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Symbol_PLUS&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nobody_has_overridden_string_plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_pop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_pop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;string_concat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;receiver&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lookup_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;receiver&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;These new checks, which only apply to a subset of method sends that is known
at bytecode-compile time, slow down all the other method sends. As (for
example) Koichi Sasada noticed when creating YARV, there is no sense compiling
addition directly to a very generic method call, dusting off your hands, and
calling it done.&lt;/p&gt;

&lt;p&gt;Instead, we should split out the send into multiple handlers and only have the
bytecode compiler generate this specialized addition opcode when we have the right
number of arguments:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_send_plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nobody_has_overridden_integer_plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;stack_popn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fixnum_add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nobody_has_overridden_string_plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;stack_popn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;string_concat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lookup_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ... as before ...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You may notice that there are still situations in which the results of some
checks are known at bytecode-compile time: we may know the type of the left-hand
side or the right-hand side. So why not specialize those cases into their own
handlers as well?&lt;/p&gt;

&lt;p&gt;Partially this is because it’s not worth the effort: we’ve already specialized
addition at no cost to other method sends and your C compiler will probably do
a good job optimizing through the helper function calls.&lt;/p&gt;

&lt;p&gt;But also, code, especially hand-written code, comes with a maintenance burden.
Are you going to manually deal with all of this&lt;sup id=&quot;fnref:deegen&quot;&gt;&lt;a href=&quot;#fn:deegen&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;? No, that would be a
total bummer.&lt;/p&gt;

&lt;h3 id=&quot;dynamic-properties&quot;&gt;Dynamic properties&lt;/h3&gt;

&lt;p&gt;There is another property we check on every addition: whether the method has
been overridden. Even if we make this override check cheap (a load and a
compare, perhaps), it’s still significant overhead for adding two small numbers.&lt;/p&gt;

&lt;p&gt;And most people who run high-performance and high-assurance applications don’t
patch the standard library! If it does happen, it is in testing libraries or
for some kind of infrequent tracing.&lt;/p&gt;

&lt;p&gt;Instead of eagerly checking, we could lazily deoptimize.&lt;/p&gt;

&lt;p&gt;We could use &lt;a href=&quot;/assets/img/ic-meets-quickening.pdf&quot;&gt;quickening&lt;/a&gt; (PDF) to generate specialized versions of
the opcode handlers that can assume &lt;em&gt;without checking&lt;/em&gt; that integer plus and
string plus have not been tampered with. Then, only if the methods get tampered
with, we re-write all of the specialized opcodes that depend on this property to
the more generic version.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_send_plus&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;stack_popn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fixnum_add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;is_string&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;stack_popn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;string_concat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lookup_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call_method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stack_ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;VALUE&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_send&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;argc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ... as before ...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
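
&lt;p&gt;The other half of lazy deoptimization is the bookkeeping: something has to
remember which opcodes were specialized so they can be rewritten later. Here is
a rough sketch, assuming one-byte opcodes that can be patched in place and a
fixed-size list of dependents. The opcode names, the list, and both functions
are hypothetical, not taken from any particular runtime.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &amp;lt;stddef.h&amp;gt;

// Hypothetical opcodes: a specialized add that skips the override check,
// and a generic add that performs it.
enum { OP_SEND_PLUS_UNCHECKED, OP_SEND_PLUS_GENERIC, OP_OTHER };

#define MAX_DEPENDENTS 64

// Every bytecode slot that was specialized under the assumption
// &quot;nobody has overridden Integer#+&quot;.
static unsigned char *dependents[MAX_DEPENDENTS];
static size_t num_dependents = 0;

// Quicken one opcode in place and remember that it depends on the assumption.
// Returns 0 if the opcode was left as it was.
int quicken_plus(unsigned char *pc) {
    if (*pc != OP_SEND_PLUS_GENERIC || num_dependents == MAX_DEPENDENTS) return 0;
    *pc = OP_SEND_PLUS_UNCHECKED;
    dependents[num_dependents++] = pc;
    return 1;
}

// Called from the (rare) path that redefines Integer#+: rewrite every
// dependent opcode back to the generic, checking version.
void invalidate_integer_plus(void) {
    for (size_t i = 0; i &amp;lt; num_dependents; i++) {
        *dependents[i] = OP_SEND_PLUS_GENERIC;
    }
    num_dependents = 0;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The cost moves from every addition to the method-definition path, which is
the trade we want if overrides are as rare as we think they are.&lt;/p&gt;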

&lt;h3 id=&quot;in-a-jit&quot;&gt;In a JIT&lt;/h3&gt;

&lt;p&gt;Now we are in the JIT compiler. Many JIT compilers take the existing
interpreter bytecode and transform it into an intermediate representation (IR)
more suitable for optimization.&lt;/p&gt;

&lt;p&gt;A stack-based bytecode, for example, might get turned into an SSA IR where the
stack has been folded away at compile-time. Take this snippet of invented
stack-based bytecode, where the way to transfer data between instructions is
the stack:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;push 1
push 2
add
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This bytecode has its stack unrolled at compile-time by abstract
interpretation. As the compiler iterates over the bytecode instructions, it
creates an IR node for each instruction. It pushes the address of that IR node
onto a compile-time stack. Then, when an opcode needs input operands, they are
read off the compile-time stack. This gives the add instruction (in this case)
direct pointers to the left and right operands.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = 1
v1 = 2
v2 = add v0 v1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It looks like we are assigning variable names here, but pretend that instead of
these made-up vNNN names there are arrows/pointers/… between instructions. This
is a dataflow graph, where uses (instructions) point to defs (their operands).&lt;/p&gt;
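
&lt;p&gt;The stack unrolling described above can be sketched in a few lines of C. This
is only an illustration for the invented push/add bytecode: the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Bytecode&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;IrNode&lt;/code&gt; types and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_ir&lt;/code&gt; function are hypothetical, and a real compiler also has to deal
with control flow, which this ignores.&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#include &amp;lt;assert.h&amp;gt;
#include &amp;lt;stddef.h&amp;gt;

// Hypothetical bytecode and IR definitions for illustration.
typedef enum { BC_PUSH, BC_ADD } BytecodeOp;
typedef struct { BytecodeOp op; int operand; } Bytecode;

typedef enum { IR_CONST, IR_ADD } IrOp;
typedef struct IrNode {
    IrOp op;
    int value;                    // only meaningful for IR_CONST
    struct IrNode *left, *right;  // only meaningful for IR_ADD
} IrNode;

#define MAX_NODES 64

// Translates stack bytecode into IR nodes stored in the nodes array. The
// stack here exists only at compile time and holds pointers to IR nodes,
// not run-time values. Returns the number of IR nodes created.
size_t build_ir(const Bytecode *code, size_t len, IrNode *nodes) {
    IrNode *stack[MAX_NODES];
    size_t sp = 0, n = 0;
    for (size_t i = 0; i &amp;lt; len; i++) {
        assert(n &amp;lt; MAX_NODES);
        IrNode *node = &amp;amp;nodes[n++];
        if (code[i].op == BC_PUSH) {
            *node = (IrNode){ .op = IR_CONST, .value = code[i].operand };
        } else {
            assert(sp &amp;gt;= 2);
            IrNode *right = stack[--sp];
            IrNode *left = stack[--sp];
            *node = (IrNode){ .op = IR_ADD, .left = left, .right = right };
        }
        stack[sp++] = node;
    }
    return n;
}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Running this over the three-instruction example produces three nodes, and the
add node ends up holding direct pointers to the two constant nodes.&lt;/p&gt;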

&lt;p&gt;This is useful because whenever we are doing some kind of local optimization
on an instruction, we need only do a bit of light pointer chasing to learn
about the properties of the operands. “Oh, the type associated with v0 is int?”
You can look it up right on the IR node.&lt;/p&gt;

&lt;p&gt;(More about this in &lt;a href=&quot;/blog/irs/&quot;&gt;What I talk about when I talk about IRs&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Then, by the time we hit the compiler, we probably have some amount of profile info.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:real-job&quot;&gt;
      &lt;p&gt;While the platonic ideal of my job might look like graph transforms,
reading through how the current Intel microarchitecture works, bug fixing, and
copying HotSpot’s (or insert your choice of dynamic language platform here)
engineering decisions, it’s mostly &lt;em&gt;not&lt;/em&gt; that. About 50% of my job is
convincing different audiences of different things.&lt;/p&gt;

      &lt;p&gt;For the purposes of this post, we will ignore the internal parts of my job that
don’t involve compilers. I’m not yet feeling confidently metacognitive
enough about those bits to do &lt;em&gt;thought leadership&lt;/em&gt;. &lt;a href=&quot;#fnref:real-job&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:measure&quot;&gt;
      &lt;p&gt;Citation needed, but people are on average not very creative. To
determine if this is the case for your programming system, you must measure
and get some statistics for the applications you care about. &lt;a href=&quot;#fnref:measure&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:deegen&quot;&gt;
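      &lt;p&gt;Tools such as Deegen aim to generate this kind of specialized handler code automatically instead of having you write it by hand. &lt;a href=&quot;#fnref:deegen&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;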
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Fri, 30 Jan 2026 00:00:00 +0000</pubDate>
            <niceDate>January 30, 2026</niceDate>
            <link>https://bernsteinbear.com/blog/seeing-like-a-jit/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/seeing-like-a-jit/</guid>
        </item>
        
        <item>
            <title>The GDB JIT interface</title>
            <description>&lt;p&gt;GDB is great for stepping through machine code to figure out what is going on.
It uses debug information under the hood to present you with a tidy backtrace
and also determine how much machine code to print when you type &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;disassemble&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This debug information comes from your compiler. Clang, GCC, rustc, etc. all
produce debug data in a format called &lt;a href=&quot;https://dwarfstd.org/&quot;&gt;DWARF&lt;/a&gt; and then embed that debug
information inside the binary (ELF, Mach-O, …) when you do &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-ggdb&lt;/code&gt; or
equivalent.&lt;/p&gt;

&lt;p&gt;Unfortunately, this means that by default, GDB has no idea what is going on if
you break in a JIT-compiled function. You can step instruction-by-instruction
and whatnot, but that’s about it. This is because the current instruction
pointer is nowhere to be found in any of the existing debug info tables from
the host runtime code, so your terminal is filled with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;???&lt;/code&gt;. See this example
from the V8 docs:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#8  0x08281674 in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#9  0xf5cae28e in ?? ()
#10 0xf5cc3a0a in ?? ()
#11 0xf5cc38f4 in ?? ()
#12 0xf5cbef19 in ?? ()
#13 0xf5cb09a2 in ?? ()
#14 0x0809e0a5 in v8::internal::Invoke (...) at src/execution.cc:97
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Fortunately, there is a &lt;em&gt;JIT interface&lt;/em&gt; to GDB. If you implement a couple of
functions in your JIT and run them every time you finish compiling a function,
you can get the debugging niceties for your JIT code too. See again a V8
example:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;#6  0x082857fc in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#7  0xf5cae28e in ?? ()
#8  0xf5cc3a0a in loop () at test.js:6
#9  0xf5cc38f4 in test.js () at test.js:13
#10 0xf5cbef19 in ?? ()
#11 0xf5cb09a2 in ?? ()
#12 0x0809e1f9 in v8::internal::Invoke (...) at src/execution.cc:97
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Unfortunately, the GDB docs are &lt;a href=&quot;https://sourceware.org/gdb/current/onlinedocs/gdb.html/JIT-Interface.html&quot;&gt;somewhat sparse&lt;/a&gt;. So I went
spelunking through a bunch of different projects to try and understand what is
going on.&lt;/p&gt;

&lt;h2 id=&quot;the-big-picture-and-the-old-interface&quot;&gt;The big picture (and the old interface)&lt;/h2&gt;

&lt;p&gt;GDB expects your runtime to expose a function called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt; and a global variable called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt;. GDB automatically sets its own internal breakpoint
on this function, if it exists. Then, when you compile code, you call this
function from your runtime.&lt;/p&gt;

&lt;p&gt;In slightly more detail:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Compile a function in your JIT compiler. This gives you a function name,
maybe other metadata, an executable code address, and a code size&lt;/li&gt;
  &lt;li&gt;Generate an &lt;em&gt;entire&lt;/em&gt; ELF/Mach-O/… object in-memory (!) for that one
function, describing its name, code region, maybe other DWARF metadata such
as line number maps&lt;/li&gt;
  &lt;li&gt;Write a &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;jit_code_entry&lt;/code&gt; linked list node that points at your object
(“symfile”)&lt;/li&gt;
  &lt;li&gt;Link it into the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_descriptor&lt;/code&gt; linked list&lt;/li&gt;
  &lt;li&gt;Call &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt;, which gives GDB control of the process so it can
pick up the new function’s metadata&lt;/li&gt;
  &lt;li&gt;Optionally, break into (or crash inside) one of your JITed functions&lt;/li&gt;
  &lt;li&gt;At some point, later, when your function gets GCed, unregister your code by
editing the linked list and calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;__jit_debug_register_code&lt;/code&gt; again&lt;/li&gt;
&lt;/ol&gt;
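
&lt;p&gt;The list bookkeeping in steps 3, 4, 5, and 7 can be sketched as follows. This is a Python model for illustration only: a real runtime declares these structures in C so that GDB can find them by symbol name. The field names and action codes follow the GDB documentation, while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unregister&lt;/code&gt;, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;notifications&lt;/code&gt; log are hypothetical helpers.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
```python
# Action codes from GDB's JIT interface documentation.
JIT_NOACTION = 0
JIT_REGISTER_FN = 1
JIT_UNREGISTER_FN = 2


class CodeEntry:
    # Models struct jit_code_entry: a doubly linked list node.
    def __init__(self, symfile):
        self.next_entry = None
        self.prev_entry = None
        self.symfile = symfile  # stands in for symfile_addr and symfile_size


class Descriptor:
    # Models struct jit_descriptor, the global __jit_debug_descriptor.
    def __init__(self):
        self.version = 1
        self.action_flag = JIT_NOACTION
        self.relevant_entry = None
        self.first_entry = None


descriptor = Descriptor()
notifications = []  # hypothetical log standing in for GDB's breakpoint


def jit_debug_register_code():
    # In a real runtime this is an empty, non-inlined C function. GDB's
    # breakpoint on it is what hands control to the debugger.
    notifications.append((descriptor.action_flag, descriptor.relevant_entry))


def register(symfile):
    # Steps 3 to 5: make a node, link it at the head, notify the debugger.
    entry = CodeEntry(symfile)
    entry.next_entry = descriptor.first_entry
    if descriptor.first_entry is not None:
        descriptor.first_entry.prev_entry = entry
    descriptor.first_entry = entry
    descriptor.relevant_entry = entry
    descriptor.action_flag = JIT_REGISTER_FN
    jit_debug_register_code()
    return entry


def unregister(entry):
    # Step 7: unlink the node, then notify the debugger again.
    if entry.prev_entry is not None:
        entry.prev_entry.next_entry = entry.next_entry
    else:
        descriptor.first_entry = entry.next_entry
    if entry.next_entry is not None:
        entry.next_entry.prev_entry = entry.prev_entry
    descriptor.relevant_entry = entry
    descriptor.action_flag = JIT_UNREGISTER_FN
    jit_debug_register_code()
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The model leaves out step 2, generating the object file that each entry points at, which is the expensive part.&lt;/p&gt;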

&lt;p&gt;This is why you see compiler projects such as V8 including large swaths of code
just to make object files:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/v8/v8/blob/5668ed57de1c7c8dd5c3dc1598bf071e17d29c8c/src/diagnostics/gdb-jit.cc&quot;&gt;V8&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/e6e925b20e6fa3fe1e100f147e1c8cd03076ebfb/cinderx/Jit/jit_gdb_support.cpp&quot;&gt;Cinder&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/zendtech/php-src/blob/f82e5b3abe1ff1d3ffc7954b0810bc584fd650a5/ext/opcache/jit/zend_jit_gdb.c#L473&quot;&gt;Zend PHP&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/dotnet/runtime/blob/3c040478f19e0f317790acab05dbe3ada9f52dc4/src/coreclr/vm/gdbjit.cpp&quot;&gt;CoreCLR/.NET&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/qemu/qemu/blob/942b0d378a1de9649085ad6db5306d5b8cef3591/tcg/tcg.c#L7064&quot;&gt;QEMU&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/WebKit/WebKit/blob/0afc2a867ab45651ac6c353c7b6ade5482b7bba7/Source/JavaScriptCore/jit/GdbJIT.cpp&quot;&gt;JavaScriptCore&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/LuaJIT/LuaJIT/blob/7152e15489d2077cd299ee23e3d51a4c599ab14f/src/lj_gdbjit.c&quot;&gt;LuaJIT&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/LineageOS/android_art/blob/8ce603e0c68899bdfbc9cd4c50dcc65bbf777982/runtime/jit/debugger_interface.cc#L187&quot;&gt;ART&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;which looks like it does something smart about grouping the JIT code
entries together (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RepackEntries&lt;/code&gt;), but I’m not sure exactly what it does&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/facebook/hhvm/blob/b1c47dcfbc574b508fd084f27ba4a06bcf4ba188/hphp/runtime/vm/debug/elfwriter.cpp#L622&quot;&gt;HHVM&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/TomatOrg/TomatoDotNet/blob/80266bb8dc0e7f0644f0638ecd98dfad4fb74427/src/dotnet/jit/gdb.c&quot;&gt;TomatoDotNet&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/jatovm/jato/blob/bb1c7d4fd987e016b2e0379182c4bfbb8c1c1a78/jit/elf.c#L164&quot;&gt;Jato JVM&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://gist.github.com/yyny/4a012029b5889853c18b1efc19bb598e&quot;&gt;a minimal example&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/sisshiki1969/jit-debug/blob/213c72512761f815fc0b067ce68ee0ae12962e2a/src/main.rs&quot;&gt;monoruby&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/mono/mono/blob/0f53e9e151d92944cacab3e24ac359410c606df6/mono/mini/dwarfwriter.c&quot;&gt;Mono&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;It looks like Dart &lt;a href=&quot;https://github.com/dart-lang/sdk/commit/c4238c71da13d61ff32332058d371c5b2e92694b&quot;&gt;used to&lt;/a&gt;
have support for this but has since removed it&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/bytecodealliance/wasmtime/blob/b5272a5f103053f5ada2a38d5302a8d1e2de442d/crates/wasmtime/src/runtime/code_memory.rs#L509&quot;&gt;wasmtime&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because this is a huge hassle, GDB also has a newer interface that does not
require making an ELF/Mach-O/…+DWARF object.&lt;/p&gt;

&lt;h2 id=&quot;custom-debug-info-the-new-interface&quot;&gt;Custom debug info (the new interface)&lt;/h2&gt;

&lt;p&gt;This new interface requires writing a binary format of your choice. You make
the writer and you make the reader. Then, when you are in GDB, you load your
reader as a shared object.&lt;/p&gt;

&lt;p&gt;The reader must implement &lt;a href=&quot;https://sourceware.org/gdb/current/onlinedocs/gdb.html/Writing-JIT-Debug-Info-Readers.html#Writing-JIT-Debug-Info-Readers&quot;&gt;the interface specified by GDB&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;GDB_DECLARE_GPL_COMPATIBLE_READER&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;extern&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gdb_reader_funcs&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;gdb_init_reader&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gdb_reader_funcs&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;cm&quot;&gt;/* Must be set to GDB_READER_INTERFACE_VERSION.  */&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;int&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;reader_version&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

  &lt;span class=&quot;cm&quot;&gt;/* For use by the reader.  */&lt;/span&gt;
  &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;priv_data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;gdb_read_debug_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gdb_unwind_frame&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;unwind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gdb_get_frame_id&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;get_frame_id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gdb_destroy_reader&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;destroy&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read&lt;/code&gt; function pointer does the bulk of the work and is responsible for
matching code ranges to function names, line numbers, and more.&lt;/p&gt;

&lt;p&gt;Here are &lt;a href=&quot;https://pwparchive.wordpress.com/2011/11/20/new-jit-interface-for-gdb/&quot;&gt;some details from Sanjoy Das&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Only a few runtimes implement this interface. Most of them stub out the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unwind&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;get_frame_id&lt;/code&gt; function pointers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/ykjit/yk/blob/755e533aa74ef5fa82a6586147727e23146b95fc/ykrt/src/compile/jitc_yk/gdb.rs#L216&quot;&gt;yk write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/ykjit/yk/blob/755e533aa74ef5fa82a6586147727e23146b95fc/ykrt/yk_gdb_plugin/yk_gdb_plugin.c#L22&quot;&gt;yk read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/tetzank/asmjit-utilities/blob/2fdbb99f7e002df4f8d7aa97c29910743adfc991/gdb/gdbjit.cpp&quot;&gt;asmjit-utilities write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/tetzank/asmjit-utilities/blob/2fdbb99f7e002df4f8d7aa97c29910743adfc991/gdb/jit-reader/gdbjit-reader.c&quot;&gt;asmjit-utilities read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/erlang/otp/blob/28a44634fb04b95ea666abb8aac7254e2c87ae05/erts/emulator/beam/jit/beam_jit_metadata.cpp#L123&quot;&gt;Erlang/OTP write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/erlang/otp-gdb-tools/blob/7b864f58c534699e4124e31ecfda86041b941037/jit-reader.c&quot;&gt;Erlang/OTP read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/FEX-Emu/FEX/blob/c8d72eabe589392b962bec94d002c5ffdb7381c2/FEXCore/Source/Interface/GDBJIT/GDBJIT.cpp#L110&quot;&gt;FEX write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/FEX-Emu/FEX/blob/c8d72eabe589392b962bec94d002c5ffdb7381c2/Source/Tools/FEXGDBReader/FEXGDBReader.cpp#L8&quot;&gt;FEX read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/bullno1/buxn-jit/blob/69effb96d5fe9725258fe367efcefd6911ef32fd/src/gdb/hook.c&quot;&gt;buxn-jit write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/bullno1/buxn-jit/blob/69effb96d5fe9725258fe367efcefd6911ef32fd/src/gdb/reader.c&quot;&gt;buxn-jit read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/KreitinnSoftware/box64/blob/f224a93cc83f9da34bc85ebb5414168d476a135d/src/tools/gdbjit.c#L45&quot;&gt;box64 write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://github.com/KreitinnSoftware/box64/blob/f224a93cc83f9da34bc85ebb5414168d476a135d/gdbjit/reader.c&quot;&gt;box64 read&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/no-defun-allowed/ccl/blob/094a9ec5bf203db118e0ffc8ce2b5b80fc1c91dd/lisp-kernel/gdb.c&quot;&gt;ccl write&lt;/a&gt; &lt;br /&gt;
&lt;a href=&quot;https://gist.github.com/no-defun-allowed/32d38c5e664586c724cf2e0e97f0d2b1&quot;&gt;ccl read&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I think it also requires at least the reader to declare that it is GPL-compatible via the
macro &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GDB_DECLARE_GPL_COMPATIBLE_READER&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Since I wrote about the &lt;a href=&quot;/blog/jit-perf-map/&quot;&gt;perf map interface&lt;/a&gt; recently, I
have it on my mind. Why can’t we reuse it in GDB?&lt;/p&gt;

&lt;h2 id=&quot;adapting-to-the-linux-perf-interface&quot;&gt;Adapting to the Linux perf interface&lt;/h2&gt;

&lt;p&gt;I suppose it would be possible to try and upstream a patch to GDB to support
the Linux perf map interface for JITs. After all, why shouldn’t it be able to
automatically pick up symbols from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp/perf-...&lt;/code&gt;? That would be great
baseline debug info for “free”.&lt;/p&gt;

&lt;p&gt;In the meantime, maybe it is reasonable to create a re-usable custom debug
reader:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When registering code, write the address and name to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp/perf-...&lt;/code&gt; as you normally would&lt;/li&gt;
  &lt;li&gt;Write the filename as the symfile (does this make &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp&lt;/code&gt; the magic number?)&lt;/li&gt;
  &lt;li&gt;Have the debug info reader just parse the perf map file&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It would be less flexible than what either the DWARF or custom readers support: it
would only be able to handle function names and code regions. No embedding source code
for GDB to display in your debugger. But maybe that is okay for a partial
solution?&lt;/p&gt;
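
&lt;p&gt;To give a feel for how little such a reader would have to do, here is a rough sketch of the parsing side in Python. It assumes the usual perf map layout of one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;START SIZE NAME&lt;/code&gt; entry per line, with the start address and size in hexadecimal. The function names are hypothetical, and a real reader would be a C shared object that reports these ranges to GDB instead of returning them.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;
```python
def parse_perf_map(text):
    # Each line is 'START SIZE NAME'; START and SIZE are hex, and NAME may
    # itself contain spaces, so split at most twice.
    symbols = []
    for line in text.splitlines():
        if not line.strip():
            continue
        start, size, name = line.split(' ', 2)
        symbols.append((int(start, 16), int(size, 16), name))
    return symbols


def lookup(symbols, pc):
    # Linear scan for the code region containing pc.
    # Returns None when pc is not inside any known JIT code region.
    for start, size, name in symbols:
        if pc - start in range(size):
            return name
    return None


sample = '7f0000001000 40 foo\n7f0000001040 20 bar baz\n'
symbols = parse_perf_map(sample)
```
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With the sample above, an address inside the first region resolves to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;foo&lt;/code&gt; and an address past the end of the second region resolves to nothing.&lt;/p&gt;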

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; Here is &lt;a href=&quot;https://github.com/tekknolagi/gdb-jit-linux-perf-map&quot;&gt;my small attempt&lt;/a&gt;
at such a plugin.&lt;/p&gt;

&lt;h2 id=&quot;the-n-squared-problem&quot;&gt;The n-squared problem&lt;/h2&gt;

&lt;p&gt;V8 notes in their &lt;a href=&quot;https://v8.dev/docs/gdb-jit&quot;&gt;GDB JIT docs&lt;/a&gt; that because the JIT interface is
a linked list and we only keep a pointer to the head, we get O(n&lt;sup&gt;2&lt;/sup&gt;)
behavior. Bummer. This becomes especially noticeable since they register
additional code objects not just for functions, but also trampolines, cache
stubs, etc.&lt;/p&gt;

&lt;h2 id=&quot;garbage-collection&quot;&gt;Garbage collection&lt;/h2&gt;

&lt;p&gt;Since GDB expects the code pointer in your symbol object file not to move, you
have to make sure to have a stable symbol file pointer and stable executable
code pointer. To make this happen, V8 disables its moving GC.&lt;/p&gt;

&lt;p&gt;Additionally, if your compiled function gets collected, you have to make sure
to unregister the function. Instead of doing this eagerly, ART treats the GDB
JIT linked list as a weakref and periodically removes dead code entries from
it.&lt;/p&gt;
</description>
            <pubDate>Tue, 30 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 30, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/gdb-jit/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/gdb-jit/</guid>
        </item>
        
        <item>
            <title>Load and store forwarding in the Toy Optimizer</title>
            <description>&lt;p&gt;&lt;em&gt;Another entry in the &lt;a href=&quot;https://pypy.org/categories/toy-optimizer.html&quot;&gt;Toy Optimizer series&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;A long, long time ago (two years!) &lt;a href=&quot;https://cfbolz.de/&quot;&gt;CF Bolz-Tereick&lt;/a&gt; and I made a &lt;a href=&quot;https://www.youtube.com/watch?v=w-UHg0yOPSE&quot;&gt;video
about load/store forwarding&lt;/a&gt; (also called load elimination) and an accompanying
&lt;a href=&quot;https://gist.github.com/tekknolagi/4e3fa26d350f6d3b39ede40d372b97fe&quot;&gt;GitHub Gist&lt;/a&gt; implementing it in the Toy Optimizer. I
said I would write a blog post about it, but never found the time—it got lost
amid a sea of large life changes.&lt;/p&gt;

&lt;p&gt;It’s a neat idea: do an abstract interpretation over the trace, modeling the
heap at compile-time, eliminating redundant loads and stores. That means it’s
possible to optimize traces like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = ...
v1 = load(v0, 5)
v2 = store(v0, 6, 123)
v3 = load(v0, 6)
v4 = load(v0, 5)
v5 = do_something(v1, v3, v4)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;into traces like this:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = ...
v1 = load(v0, 5)
v2 = store(v0, 6, 123)
v5 = do_something(v1, 123, v1)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load(v0, 5)&lt;/code&gt; is equivalent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*(v0+5)&lt;/code&gt; in C syntax and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;store(v0, 6,
123)&lt;/code&gt; is equivalent to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;*(v0+6)=123&lt;/code&gt; in C syntax)&lt;/p&gt;

&lt;p&gt;This indicates that we were able to eliminate two redundant loads by keeping
around information about previous loads and stores. Let’s get to work making
this possible.&lt;/p&gt;

&lt;h2 id=&quot;the-usual-infrastructure&quot;&gt;The usual infrastructure&lt;/h2&gt;

&lt;p&gt;We’ll start off with the usual infrastructure from the &lt;a href=&quot;https://pypy.org/categories/toy-optimizer.html&quot;&gt;Toy
Optimizer series&lt;/a&gt;: a very stringly-typed representation of a
&lt;a href=&quot;https://gist.github.com/tekknolagi/4e3fa26d350f6d3b39ede40d372b97fe#file-port-py-L4-L112&quot;&gt;trace-based SSA IR&lt;/a&gt; and a union-find rewrite mechanism.&lt;/p&gt;

&lt;p&gt;This means we can start writing some new optimization pass and our first test:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# TODO: copy an optimized version of bb into opt_bb
&lt;/span&gt;    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_two_loads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = escape(var1)
var3 = escape(var1)&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This test is asserting that we can remove duplicate loads. Why load twice if we
can cache the result? Let’s make that happen.&lt;/p&gt;

&lt;h2 id=&quot;caching-loads&quot;&gt;Caching loads&lt;/h2&gt;

&lt;p&gt;To do this, we’ll model the heap at compile-time. When I say “model”, I
mean that we will have an imprecise but correct abstract representation of the
heap: we don’t (and can’t) have knowledge of every value, but we can know for
sure that some addresses have certain values.&lt;/p&gt;

&lt;p&gt;For example, if we have observed a load from object &lt;em&gt;O&lt;/em&gt; at offset &lt;em&gt;8&lt;/em&gt;, written &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0 =
load(O, 8)&lt;/code&gt;, we know that the SSA value &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0&lt;/code&gt; is at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;heap[(O, 8)]&lt;/code&gt;. That sounds
tautological, but it’s not. Future loads can make use of this information.&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;index&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Stores things we know about the heap at... compile-time.
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# Key: an object and an offset pair acting as a heap address
&lt;/span&gt;    &lt;span class=&quot;c1&quot;&gt;# Value: a previous SSA value we know exists at that address
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;make_equal_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;previous&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This pass records information about loads and uses the result of a previous
cached load operation if available. We treat the pair of (SSA value, offset) as
an address into our abstract heap.&lt;/p&gt;

&lt;p&gt;That’s great! If you run our simple test, it should now pass. But what happens
if we store into that address before the second load? Oops…&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_store_to_same_object_offset_invalidates_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = store(var0, 0, 5)
var3 = load(var0, 0)
var4 = escape(var1)
var5 = escape(var3)&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This test fails because we are incorrectly keeping around &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var1&lt;/code&gt; in our
abstract heap. We need to get rid of it and not replace &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var3&lt;/code&gt; with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var1&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;invalidating-cached-loads&quot;&gt;Invalidating cached loads&lt;/h2&gt;

&lt;p&gt;So it turns out we have to also model stores in order to cache loads correctly.
One valid, albeit aggressive, way to do that is to throw away all the
information we know at each store operation:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clear&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That makes our test pass—yay!—but at great cost. It means any store
operation mucks up redundant loads. In our world where we frequently read from
and write to objects, this is what we call a huge bummer.&lt;/p&gt;

&lt;p&gt;For example, a store to offset 4 on some object should never interfere with a
load from a different offset on the same object&lt;sup id=&quot;fnref:size&quot;&gt;&lt;a href=&quot;#fn:size&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;. We should be able to
keep our load from offset 0 cached here:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_store_to_same_object_different_offset_does_not_invalidate_load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = store(var0, 4, 5)
var3 = escape(var1)
var4 = escape(var1)&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We could try instead checking if our specific (object, offset) pair is in the
heap and only removing cached information about that offset and that object.
That would definitely help!&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;del&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It makes our test pass, too, which is great news.&lt;/p&gt;

&lt;p&gt;Unfortunately, this runs into problems due to aliasing: it’s entirely possible
that our compile-time heap could contain a pair &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v0, 0)&lt;/code&gt; and a pair &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v1, 0)&lt;/code&gt; where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v0&lt;/code&gt;
and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1&lt;/code&gt; are the same object at run-time (but the optimizer does not know
that). A store through &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v1, 0)&lt;/code&gt; would then delete only its own entry and leave
the stale cached load for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v0, 0)&lt;/code&gt; in place, so a later load from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(v0, 0)&lt;/code&gt;
would be incorrectly replaced with the old value.&lt;/p&gt;
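
&lt;p&gt;To make the failure concrete, here is a small self-contained sketch. It does
not use the post’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Block&lt;/code&gt; API: the tuple-based mini-IR and the helper names
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reused_loads_exact_match&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run&lt;/code&gt; are assumptions made up for illustration.
It applies exact-match invalidation to three operations, then executes the same
operations with both names bound to one object.&lt;/p&gt;

```python
# Hypothetical mini-IR (not the post's Block API): each op is a tuple.
# "a" and "b" are two SSA names that may refer to the same object.
ops = [
    ("load", "a", 0),      # first load from (a, 0)
    ("store", "b", 0, 5),  # store 5 into (b, 0)
    ("load", "a", 0),      # second load from (a, 0)
]


def reused_loads_exact_match(ops):
    """Return indices of loads that exact-match invalidation would reuse."""
    heap = {}
    reused = []
    for index, op in enumerate(ops):
        if op[0] == "store":
            # Only forget the exact (name, offset) pair being written.
            heap.pop((op[1], op[2]), None)
        elif op[0] == "load":
            key = (op[1], op[2])
            if key in heap:
                reused.append(index)
            else:
                heap[key] = index
    return reused


def run(ops, env):
    """Execute ops concretely; env maps names to run-time objects (lists)."""
    loaded = []
    for op in ops:
        if op[0] == "store":
            env[op[1]][op[2]] = op[3]
        elif op[0] == "load":
            loaded.append(env[op[1]][op[2]])
    return loaded


shared = [1]  # one run-time object, reachable through both names
results = run(ops, {"a": shared, "b": shared})
reused = reused_loads_exact_match(ops)
```

&lt;p&gt;When both names point at the same list, the two loads observe different values
at run-time, yet the exact-match strategy still marks the second load (index 2)
as reusable. That is the miscompile described above.&lt;/p&gt;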

&lt;p&gt;That breaks a core rule of abstract interpretation: our abstract interpreter
has to correctly model &lt;em&gt;all&lt;/em&gt; possible outcomes at run-time. So we should
instead pick some tactic in between clearing all information (correct but
over-eager) and clearing only exact matches of object+offset (incorrect).&lt;/p&gt;

&lt;p&gt;The concept that will help us here is the &lt;em&gt;alias class&lt;/em&gt;: a way to
efficiently partition objects in your abstract heap into completely disjoint
sets. Writes to any object in one class never affect objects in another
class.&lt;/p&gt;

&lt;p&gt;Our very scrappy alias classes will be just based on the offset: each offset is
a different alias class. If we write to any object at offset K, we have to
invalidate all of our compile-time offset K knowledge—even if it’s for
another object. This is a nice middle ground, and it’s possible because our
(made up) object system guarantees that distinct objects do not overlap, and
also that we are not writing out-of-bounds.&lt;sup id=&quot;fnref:tbaa&quot;&gt;&lt;a href=&quot;#fn:tbaa&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;&lt;/p&gt;

&lt;p&gt;So let’s remove all of the entries from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;compile_time_heap&lt;/code&gt; where the offset
matches the offset in the current &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;store&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;items&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Great! Now our test passes.&lt;/p&gt;
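
&lt;p&gt;As a sanity check on the aliasing argument, the per-offset rule can also be
sketched on its own. This is a rough illustration rather than the post’s
implementation: the tuple-based ops and the helper name &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reused_loads_per_offset&lt;/code&gt;
are hypothetical stand-ins for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Block&lt;/code&gt; API.&lt;/p&gt;

```python
def reused_loads_per_offset(ops):
    """Return indices of loads that per-offset invalidation would reuse.

    Hypothetical mini-IR: ("load", name, offset) and
    ("store", name, offset, value).
    """
    heap = {}
    reused = []
    for index, op in enumerate(ops):
        if op[0] == "store":
            offset = op[2]
            # Forget every cached load at this offset, whatever the object.
            heap = {key: val for key, val in heap.items() if key[1] != offset}
        elif op[0] == "load":
            key = (op[1], op[2])
            if key in heap:
                reused.append(index)
            else:
                heap[key] = index
    return reused


# A store through a different name at the same offset: might alias.
maybe_aliased = [("load", "a", 0), ("store", "b", 0, 5), ("load", "a", 0)]
# A store at a different offset: cannot touch the cached load.
different_offset = [("load", "a", 0), ("store", "b", 4, 5), ("load", "a", 0)]

aliased_result = reused_loads_per_offset(maybe_aliased)
different_result = reused_loads_per_offset(different_offset)
```

&lt;p&gt;In the first sequence nothing is reused, because the store might alias the
cached load. In the second, the load at index 2 is still served from the
cache.&lt;/p&gt;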

&lt;p&gt;This concludes the load optimization section of the post. We have modeled
enough of loads and stores that we can eliminate redundant loads. Very cool.
But we can go further.&lt;/p&gt;

&lt;h2 id=&quot;caching-stores&quot;&gt;Caching stores&lt;/h2&gt;

&lt;p&gt;Stores don’t just invalidate information. They also give us new information!
Any time we see an operation of the form &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v1 = store(v0, 8, 5)&lt;/code&gt; we also learn
that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;load(v0, 8) == 5&lt;/code&gt;! Until it gets invalidated, anyway.&lt;/p&gt;

&lt;p&gt;For example, in this test, we can eliminate the load from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;var0&lt;/code&gt; at offset 0:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_load_after_store_removed&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = store(var0, 0, 5)
var2 = load(var0, 1)
var3 = escape(5)
var4 = escape(var2)&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Making that work is thankfully not very hard; we need only add that new
information to the compile-time heap after removing all the
potentially-aliased info:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# ... as before ...
&lt;/span&gt;            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# NEW!
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This makes the test pass. It makes another test fail, but only
because—oops—we now know more. You can delete the old test because the new
test supersedes it.&lt;/p&gt;

&lt;p&gt;Now, note that we are not removing the store. This is because we have nothing
in our optimizer that keeps track of what might have observed the side-effects
of the store. What if the object got &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;escape&lt;/code&gt;d? Or someone did a load later on?
We would only be able to remove the store (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;continue&lt;/code&gt;) if we could guarantee it
was not observable.&lt;/p&gt;

&lt;p&gt;In our current framework, this only happens in one case: someone is doing a
store of the exact same value that already exists in our compile-time heap.
That is, either the same constant, or the same SSA value. If we see this, then
we can completely skip the second store instruction.&lt;/p&gt;

&lt;p&gt;Here’s a test case for that, where we have gained information from the load
instruction that we can then use to get rid of the store instruction:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_load_then_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = load(var0, 0)
var2 = escape(var1)&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s make it pass. To do that, first we’ll make an equality function that
works for both constants and operations. Constants are equal if their values
are equal, and operations are equal only if they are the very same operation
object (compared by address/pointer).&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;eq_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isinstance&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is a partial equality: if two operations are not equal under &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eq_value&lt;/code&gt;,
it doesn’t mean that they are different, only that we don’t know that they are
the same.&lt;/p&gt;
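&lt;p&gt;To make the “partial” part concrete, here is a small self-contained sketch.
The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Value&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Constant&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Operation&lt;/code&gt; classes below are stripped-down
stand-ins that I am assuming for illustration, not the real toy optimizer
classes; only &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;eq_value&lt;/code&gt; matches the function above:&lt;/p&gt;

```python
# Hypothetical, minimal stand-ins for the toy optimizer's IR classes.
class Value:
    pass


class Constant(Value):
    def __init__(self, value):
        self.value = value


class Operation(Value):
    def __init__(self, name):
        self.name = name


def eq_value(left, right):
    # Two constants are equal when their payloads are equal.
    if isinstance(left, Constant) and isinstance(right, Constant):
        return left.value == right.value
    # Everything else is equal only if it is the same object.
    return left is right


load_a = Operation("load")
load_b = Operation("load")

assert eq_value(Constant(5), Constant(5))      # same constant value
assert not eq_value(Constant(5), Constant(7))  # different constant values
assert eq_value(load_a, load_a)                # the same SSA value
assert not eq_value(load_a, load_b)            # unknown, not "known different"
assert not eq_value(None, Constant(5))         # nothing known about the slot
```

&lt;p&gt;The last two cases are the interesting ones: the function answers “no”
whenever it cannot prove equality, which is the safe direction for an
optimizer that only removes a store when the answer is “yes”.&lt;/p&gt;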

&lt;p&gt;Then, after that, we need only check if the current value in the compile-time
heap is the same as the value being stored. If it is, wonderful. No need to
store. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;continue&lt;/code&gt; and don’t append the operation to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;opt_bb&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;int&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;store_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;obj&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;offset&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;current_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;get&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;store_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;eq_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;current_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# NEW!
&lt;/span&gt;                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# ... as before ...
&lt;/span&gt;            &lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;        &lt;span class=&quot;k&quot;&gt;elif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;arg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;make_equal_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;continue&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;compile_time_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;load_info&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This makes our load-then-store test pass, and it also makes other tests pass,
like eliminating a store after another store!&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_store_after_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = store(var0, 0, 5)&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Unfortunately, this only works if the values (constants or SSA values) are
known to be the same. If we store &lt;em&gt;different&lt;/em&gt; values, we can’t optimize. In the
live stream, we left this as an exercise for the viewer:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;@pytest.mark.xfail&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_exercise_for_the_reader&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arg0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;escape&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;var2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_load_store&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;assert&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;bb_to_str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opt_bb&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\
&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;var0 = getarg(0)
var1 = store(var0, 0, 7)
var2 = escape(7)&lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We would only be able to optimize this away if we had some notion of a store
being &lt;em&gt;dead&lt;/em&gt;. In this case, that is a store in which the value is never read
before being overwritten.&lt;/p&gt;

&lt;h2 id=&quot;removing-dead-stores&quot;&gt;Removing dead stores&lt;/h2&gt;

&lt;p&gt;TODO, I suppose. I have not gotten this far yet. If I get around to it, I will
come back and update the post.&lt;/p&gt;
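&lt;p&gt;Until then, here is a minimal block-local sketch of the idea. It is not from
the stream and does not use the real toy optimizer API: I am assuming
operations are plain tuples, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;remove_dead_stores&lt;/code&gt; is a hypothetical name. It
walks the block backwards, treats any load at the same offset as a possible
read (objects may alias), and treats every other operation as a barrier, which
is a deliberately conservative assumption:&lt;/p&gt;

```python
def remove_dead_stores(ops):
    """Drop stores that are overwritten before anything can read them.

    ops is a list of tuples (an assumed encoding, for illustration only):
      ("store", obj, offset, value), ("load", obj, offset), or anything else.
    """
    # Slots (obj, offset) that a *later* store overwrites with no read between.
    overwritten = set()
    kept = []
    for op in reversed(ops):
        kind = op[0]
        if kind == "store":
            slot = (op[1], op[2])
            if slot in overwritten:
                continue  # dead: a later store to the same slot wins
            overwritten.add(slot)
        elif kind == "load":
            # The loaded object may alias any other object, so every earlier
            # store at this offset might be observed by this load.
            offset = op[2]
            overwritten = {s for s in overwritten if s[1] != offset}
        else:
            # Unknown operation (escape, getarg, ...): assume it can observe
            # the whole heap and forget everything.
            overwritten.clear()
        kept.append(op)
    kept.reverse()
    return kept


block = [
    ("getarg", 0),
    ("store", "arg0", 0, 5),
    ("store", "arg0", 0, 7),
    ("load", "arg0", 0),
    ("escape", "var2"),
]
optimized = remove_dead_stores(block)
assert ("store", "arg0", 0, 5) not in optimized
assert ("store", "arg0", 0, 7) in optimized
```

&lt;p&gt;This only removes the first store; forwarding the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;7&lt;/code&gt; into the load is still
the job of the forward pass above.&lt;/p&gt;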

&lt;h2 id=&quot;in-the-real-world&quot;&gt;In the real world&lt;/h2&gt;

&lt;p&gt;This small optimization pass may seem silly or fiddly—when would we ever see
something like this in a real IR?—but it’s pretty useful. Here’s the Ruby
code that got me thinking about it again some years later for ZJIT:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;C&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;CRuby has a shape system and ZJIT makes use of it, so we end up optimizing this
code (if it’s monomorphic) into a series of shape checks and stores. The HIR
might end up looking something like the mess below, where I’ve annotated the
shape guards (can be thought of as loads) and stores with asterisks:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;fn initialize@tmp/init.rb:3:
# ...
bb2(v6:BasicObject):
  v10:Fixnum[1] = Const Value(1)
  v31:HeapBasicObject = GuardType v6, HeapBasicObject
* v32:HeapBasicObject = GuardShape v31, 0x400000
* StoreField v32, :@a@0x10, v10
  WriteBarrier v32, v10
  v35:CShape[0x40008e] = Const CShape(0x40008e)
* StoreField v32, :_shape_id@0x4, v35
  v16:Fixnum[2] = Const Value(2)
  v37:HeapBasicObject = GuardType v6, HeapBasicObject
* v38:HeapBasicObject = GuardShape v37, 0x40008e
* StoreField v38, :@b@0x18, v16
  WriteBarrier v38, v16
  v41:CShape[0x40008f] = Const CShape(0x40008f)
* StoreField v38, :_shape_id@0x4, v41
  v22:Fixnum[3] = Const Value(3)
  v43:HeapBasicObject = GuardType v6, HeapBasicObject
* v44:HeapBasicObject = GuardShape v43, 0x40008f
* StoreField v44, :@c@0x20, v22
  WriteBarrier v44, v22
  v47:CShape[0x400090] = Const CShape(0x400090)
* StoreField v44, :_shape_id@0x4, v47
  CheckInterrupts
  Return v22
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If we had store-load forwarding in ZJIT, we could get rid of the intermediate
shape guards; they would know the shape from the previous &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StoreField&lt;/code&gt;
instruction. If we had dead store elimination, we could get rid of the
intermediate shape writes; they are never read. (And the repeated type guards
that check if it’s a heap object are still just silly and need to get removed
eventually.)&lt;/p&gt;

&lt;p&gt;This is on the roadmap and will make object initialization even faster than it
is right now.&lt;/p&gt;
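&lt;p&gt;As a rough illustration of the guard half of that, here is a toy model in
Python. It is not ZJIT code: the tuple instruction names and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;drop_known_shape_guards&lt;/code&gt; are hypothetical, every guard result is assumed to
have already been resolved back to one name per object, stores to instance
variable fields are assumed never to touch the shape word, and calls or other
side effects are not modeled at all:&lt;/p&gt;

```python
def drop_known_shape_guards(ops):
    """Remove shape guards whose answer is already known from a prior
    shape store or a prior identical guard (toy model, assumptions above)."""
    known_shape = {}  # object name -> shape id we know it currently has
    out = []
    for op in ops:
        kind = op[0]
        if kind == "guard_shape":
            _, obj, shape = op
            if known_shape.get(obj) == shape:
                continue  # the guard cannot fail, so skip it
            # Past this point the guard succeeded, so the shape is known.
            known_shape[obj] = shape
        elif kind == "store_shape":
            _, obj, shape = op
            # Other names might alias obj, so forget what we knew about them.
            known_shape = {obj: shape}
        out.append(op)
    return out


# Loosely modeled on the HIR above, with one name (v6) for the object.
hir = [
    ("guard_shape", "v6", 0x400000),
    ("store_field", "v6", "@a", 1),
    ("store_shape", "v6", 0x40008e),
    ("guard_shape", "v6", 0x40008e),
    ("store_field", "v6", "@b", 2),
    ("store_shape", "v6", 0x40008f),
    ("guard_shape", "v6", 0x40008f),
    ("store_field", "v6", "@c", 3),
    ("store_shape", "v6", 0x400090),
]
optimized = drop_known_shape_guards(hir)
guards = [op for op in optimized if op[0] == "guard_shape"]
assert guards == [("guard_shape", "v6", 0x400000)]
```

&lt;p&gt;Only the first guard survives in this model, because nothing before it
tells us the shape of the incoming object.&lt;/p&gt;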

&lt;h2 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;Thanks for reading the text version of the video that CF and I made a while
back. Now you know how to do load/store elimination on traces.&lt;/p&gt;

&lt;p&gt;I think this does not need too much extra work to get it going on full CFGs; a
block is pretty much the same as a trace, so you can do a block-local version
without much fuss. If you want to go global, you need dominator information and
gen-kill sets.&lt;/p&gt;

&lt;p&gt;Maybe I will touch on this in a future post…&lt;/p&gt;

&lt;h2 id=&quot;thank-you&quot;&gt;Thank you&lt;/h2&gt;

&lt;p&gt;Thank you to CF, who walked me through this live on a stream two years ago!
This blog post wouldn’t be possible without you.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:size&quot;&gt;
      &lt;p&gt;In this toy optimizer example, we are assuming that all reads and writes
are the same size and different offsets don’t overlap at all. This is often
the case for managed runtimes, where object fields are pointer-sized and
all reads/writes are pointer-aligned. &lt;a href=&quot;#fnref:size&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:tbaa&quot;&gt;
      &lt;p&gt;We could do better. If we had type information, we could also use that
to make alias classes. Writes to a List will never overlap with writes to a
Map, for example. This requires your compiler to have strict aliasing—if
you can freely cast between types, as in C, then this tactic goes out the
window.&lt;/p&gt;

      &lt;p&gt;This is called &lt;a href=&quot;/assets/img/tbaa.pdf&quot;&gt;Type-based alias analysis&lt;/a&gt; (PDF). &lt;a href=&quot;#fnref:tbaa&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Wed, 24 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 24, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/toy-load-store/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/toy-load-store/</guid>
        </item>
        
        <item>
            <title>ZJIT is now available in Ruby 4.0</title>
            <description>&lt;p&gt;&lt;em&gt;Originally published on &lt;a href=&quot;https://railsatscale.com/2025-12-24-launch-zjit/&quot;&gt;Rails At Scale&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;ZJIT is a new just-in-time (JIT) Ruby compiler built into the reference Ruby
implementation, &lt;a href=&quot;https://en.wikipedia.org/wiki/YARV&quot;&gt;YARV&lt;/a&gt;, by the same compiler group that brought you YJIT.
We (Aaron Patterson, Aiden Fox Ivey, Alan Wu, Jacob Denbeaux, Kevin Menard, Max
Bernstein, Maxime Chevalier-Boisvert, Randy Stauner, Stan Lo, and Takashi
Kokubun) have been working on ZJIT since the beginning of this year.&lt;/p&gt;

&lt;p&gt;In case you missed the last post, we’re building a new compiler for Ruby
because we want to both raise the performance ceiling (bigger compilation unit
size and SSA IR) and encourage more outside contribution (by becoming a more
traditional method compiler).&lt;/p&gt;

&lt;p&gt;It’s been a long time since we gave an official update on ZJIT. Things are
going well. We’re excited to share our progress with you. We’ve done a lot
&lt;a href=&quot;/blog/merge-zjit/&quot;&gt;since May&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;in-brief&quot;&gt;In brief&lt;/h2&gt;

&lt;p&gt;ZJIT is compiled by default—but not enabled by default—in Ruby 4.0. Enable
it by passing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit&lt;/code&gt; flag, setting the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUBY_ZJIT_ENABLE&lt;/code&gt; environment variable,
or calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RubyVM::ZJIT.enable&lt;/code&gt; after starting your application.&lt;/p&gt;

&lt;p&gt;It’s faster than the interpreter, but not yet as fast as YJIT. &lt;strong&gt;Yet.&lt;/strong&gt; But we
have a plan, and we have some more specific numbers below. The TL;DR is we have
a great new foundation and now need to pull out all the Ruby-specific stops to
match YJIT.&lt;/p&gt;

&lt;p&gt;We encourage you to experiment with ZJIT, but maybe hold off on deploying it in
production for now. This is a very new compiler. You should expect crashes and
wild performance degradations (or, perhaps, improvements). Please test locally,
try to run CI, etc, and let us know what you run into on &lt;a href=&quot;https://bugs.ruby-lang.org/projects/ruby-master/issues?set_filter=1&amp;amp;tracker_id=1&quot;&gt;the Ruby issue
tracker&lt;/a&gt; (or, if you don’t want to make a Ruby Bugs account, we would
also take reports &lt;a href=&quot;https://github.com/Shopify/ruby/issues&quot;&gt;on GitHub&lt;/a&gt;).&lt;/p&gt;

&lt;h2 id=&quot;state-of-the-compiler&quot;&gt;State of the compiler&lt;/h2&gt;

&lt;p&gt;To underscore how much has happened since the &lt;a href=&quot;/blog/merge-zjit/&quot;&gt;announcement that ZJIT was merged
into CRuby&lt;/a&gt;, we present a series of comparisons:&lt;/p&gt;

&lt;h3 id=&quot;side-exits&quot;&gt;Side-exits&lt;/h3&gt;

&lt;p&gt;Back in May, we could not side-exit from JIT code into the interpreter. This
meant that the code we were running had to continue to have the same
preconditions (expected types, no method redefinitions, etc) or the JIT would
safely abort. &lt;strong&gt;Now,&lt;/strong&gt; we can side-exit and use this feature liberally.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, we gracefully handle the phase transition from integer to string;
a guard instruction fails and transfers control to the interpreter.&lt;/p&gt;

  &lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;add&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;three&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;four&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;

&lt;p&gt;This enables running a lot more code!&lt;/p&gt;

&lt;h3 id=&quot;more-code&quot;&gt;More code&lt;/h3&gt;

&lt;p&gt;Back in May, we could only run a handful of small benchmarks. &lt;strong&gt;Now,&lt;/strong&gt; we can
run all sorts of code, including passing the full Ruby test suite, the test
suite and shadow traffic of a large application at Shopify, and the test suite
of GitHub.com! Also a bank, apparently.&lt;/p&gt;

&lt;p&gt;Back in May, we did not optimize much; we only really optimized operations
on fixnums (small integers) and method sends to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;main&lt;/code&gt; object. &lt;strong&gt;Now,&lt;/strong&gt;
we optimize a lot more: all sorts of method sends, instance variable reads
and writes, attribute accessor/reader/writer use, struct reads and writes,
object allocations, certain string operations, optional parameters, and more.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, we can &lt;a href=&quot;https://en.wikipedia.org/wiki/Constant_folding&quot;&gt;constant-fold&lt;/a&gt; numeric operations. Because we also have a
(small, limited) inliner borrowed from YJIT, we can constant-fold the entirety
of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt; down to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3&lt;/code&gt;—and still handle redefinitions of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;one&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;two&lt;/code&gt;,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer#+&lt;/code&gt;, …&lt;/p&gt;

  &lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;one&lt;/span&gt;
  &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;two&lt;/span&gt;
  &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;one&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;two&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;
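
&lt;p&gt;To make the redefinition point concrete, here is a small sketch in plain Ruby. It
only shows the behavior a program can observe, not how ZJIT invalidates code
internally, and the call counts are illustrative rather than real compilation
thresholds. Once &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;one&lt;/code&gt; is redefined, a folded constant &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;3&lt;/code&gt; would be stale, so
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add&lt;/code&gt; has to produce the new sum.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def one
  1
end

def two
  2
end

def add
  one + two
end

# Call add repeatedly, as a hot method would be called.
before = 3.times.map { add }

# Redefine one. Any compiled code that assumed the old body is now wrong.
def one
  10
end

after = add

p before # prints [3, 3, 3]
p after  # prints 12
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;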

&lt;h3 id=&quot;register-spilling&quot;&gt;Register spilling&lt;/h3&gt;

&lt;p&gt;Back in May, we could not compile many large functions due to limitations of
our backend that we borrowed from YJIT. &lt;strong&gt;Now,&lt;/strong&gt; we can compile absolutely
enormous functions just fine. And quickly, too. Though we have not been
focusing specifically on compiler performance, we compile even large methods in
under a millisecond.&lt;/p&gt;

&lt;h3 id=&quot;c-methods&quot;&gt;C methods&lt;/h3&gt;

&lt;p&gt;Back in May, we could not even optimize calls to built-in C methods. &lt;strong&gt;Now,&lt;/strong&gt;
we have a feature similar to JavaScriptCore’s DOMJIT, which allows us to emit
inline HIR versions of certain well-known C methods. This allows the optimizer
to reason about these methods and their effects (more on this in a future post)
much more… er, effectively.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer#succ&lt;/code&gt;, which is defined as adding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;1&lt;/code&gt; to an integer, is a
C method. It’s used in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Integer#times&lt;/code&gt; to drive the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;while&lt;/code&gt; loop. Instead of
emitting a call to it, our C method “inliner” can emit our existing &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;FixnumAdd&lt;/code&gt;
instruction and take advantage of the rest of the type inference and
constant-folding.&lt;/p&gt;

  &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;inline_integer_succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt;
                       &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;Option&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;InsnId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.is_empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;None&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.likely_a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.coerce_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;types&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Fixnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push_insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;val&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Const&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;Value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;VALUE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;fixnum_from_usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fun&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.push_insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;hir&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FixnumAdd&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;left&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;right&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Some&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;nb&quot;&gt;None&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;fewer-c-calls&quot;&gt;Fewer C calls&lt;/h3&gt;

&lt;p&gt;Back in May, the machine code ZJIT generated called a lot of C functions from
the CRuby runtime to implement our HIR instructions in LIR. We have pared this
down significantly and now “open code” the implementations in LIR.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;GuardNotFrozen&lt;/code&gt; used to call out to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;rb_obj_frozen_p&lt;/code&gt;. Now, it
requires that its input is a heap-allocated object and can instead do a load, a
test, and a conditional jump.&lt;/p&gt;

  &lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;gen_guard_not_frozen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;JITState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                        &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Assembler&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                        &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Opnd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                        &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FrameState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Opnd&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// It&apos;s a heap object, so check the frozen flag&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.load&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;Opnd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;mem&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;RUBY_OFFSET_RBASIC_FLAGS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.test&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;RUBY_FL_FREEZE&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;as&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.into&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// Side-exit if frozen&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;asm&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.jnz&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;side_exit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;state&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GuardNotFrozen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;recv&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;
&lt;/blockquote&gt;

&lt;h3 id=&quot;more-teammates&quot;&gt;More teammates&lt;/h3&gt;

&lt;p&gt;Back in May, we had four people working full-time on the compiler. &lt;strong&gt;Now,&lt;/strong&gt; we
have more internally at Shopify—and also more from the community! We have
had several interested people reach out, learn about ZJIT, and successfully
land complex changes. For this reason, we have opened up &lt;a href=&quot;https://zjit.zulipchat.com&quot;&gt;a chat
room&lt;/a&gt; to discuss and improve ZJIT.&lt;/p&gt;

&lt;h3 id=&quot;a-cool-graph-visualization-tool&quot;&gt;A cool graph visualization tool&lt;/h3&gt;

&lt;p&gt;You &lt;em&gt;have to&lt;/em&gt; check out our intern Aiden’s &lt;a href=&quot;https://railsatscale.com/2025-11-19-adding-iongraph-support/&quot;&gt;integration of Iongraph into
ZJIT&lt;/a&gt;. Now we
have clickable, zoomable, scrollable graphs of all our functions and all our
optimization passes. It’s great!&lt;/p&gt;

&lt;p&gt;Try zooming (Ctrl-scroll), clicking the different optimization passes on the
left, clicking the instruction IDs in each basic block (definitions and uses),
and seeing how the IR for the below Ruby code changes over time.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Point&lt;/span&gt;
  &lt;span class=&quot;nb&quot;&gt;attr_accessor&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:y&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;no&quot;&gt;P&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Point&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;freeze&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;P&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;iframe title=&quot;Iongraph Viewer&quot; aria-label=&quot;Interactive compiler graph visualization&quot; src=&quot;/assets/html/zjit-viewer.html&quot; width=&quot;100%&quot; height=&quot;400&quot;&gt;&lt;/iframe&gt;

&lt;h3 id=&quot;more&quot;&gt;More&lt;/h3&gt;

&lt;p&gt;…and so, so many garbage collection fixes.&lt;/p&gt;

&lt;p&gt;There’s still a lot to do, though.&lt;/p&gt;

&lt;h2 id=&quot;to-do&quot;&gt;To do&lt;/h2&gt;

&lt;p&gt;We’re going to optimize &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invokeblock&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;yield&lt;/code&gt;) and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;invokesuper&lt;/code&gt; (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;super&lt;/code&gt;)
instructions, each of which behaves similarly, but not identically, to a
normal &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;send&lt;/code&gt; instruction. These are pretty common.&lt;/p&gt;

&lt;p&gt;We’re going to optimize &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;setinstancevariable&lt;/code&gt; in the case where we have to
transition the object’s shape. This will help normal &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@a = b&lt;/code&gt; situations. It
will also help &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@a ||= b&lt;/code&gt;, but I think we can even do better with the latter
using some kind of value numbering.&lt;/p&gt;
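
&lt;p&gt;As a rough illustration of why the first write is the interesting one, the sketch
below uses a hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Cache&lt;/code&gt; class (not taken from ZJIT or any library). The
first call to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;value&lt;/code&gt; adds the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@value&lt;/code&gt; instance variable to the object, which
is where a shape transition would happen. Later calls only read it.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class Cache
  attr_reader :computations

  def initialize
    @computations = 0
  end

  def value
    # First call: @value is unset, so this is a write that adds a new
    # instance variable. Every later call is just a read.
    @value ||= compute
  end

  private

  def compute
    @computations += 1
    42
  end
end

cache = Cache.new
ivars_before = cache.instance_variables
cache.value
cache.value
ivars_after = cache.instance_variables

p ivars_before       # prints [:@computations]
p ivars_after        # prints [:@computations, :@value]
p cache.computations # prints 1
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;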

&lt;p&gt;We only optimize monomorphic calls right now—cases where a method send only
sees one class of receiver while being profiled. We’re going to optimize
polymorphic sends, too. Right now we’re laying the groundwork (a new register
allocator; see below) to make this much easier. It’s not as much of an
immediate focus, though, because most (high 80s, low 90s percent) of sends are
monomorphic.&lt;/p&gt;
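
&lt;p&gt;For readers unfamiliar with the terms, here is a minimal sketch. The classes and the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;receiver_classes&lt;/code&gt; helper are hypothetical and exist only for illustration. The send
of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;area&lt;/code&gt; inside &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;total_area&lt;/code&gt; is one call site. If a program only ever passes it
circles, that call site is monomorphic. If it also passes squares, the same
call site becomes polymorphic.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;class Circle
  def initialize(radius)
    @radius = radius
  end

  def area
    3 * @radius * @radius
  end
end

class Square
  def initialize(side)
    @side = side
  end

  def area
    @side * @side
  end
end

def total_area(shapes)
  total = 0
  # The send of area below is a single call site.
  shapes.each { |shape| total += shape.area }
  total
end

# Hypothetical helper: the receiver classes that call site would see.
def receiver_classes(shapes)
  shapes.map { |shape| shape.class }.uniq
end

only_circles = [Circle.new(1), Circle.new(2)]
mixed = [Circle.new(1), Square.new(2)]

p receiver_classes(only_circles) # prints [Circle]
p receiver_classes(mixed)        # prints [Circle, Square]
p total_area(only_circles)       # prints 15
p total_area(mixed)              # prints 7
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;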

&lt;p&gt;We’re in the middle of re-writing the register allocator after reading the
entire history of linear scan papers and several implementations. That will
unlock performance improvements and also allow us to make the IRs easier to
use.&lt;/p&gt;

&lt;p&gt;We don’t handle phase changes particularly well yet; if your method call
patterns change significantly after your code has been compiled, we will
frequently side-exit into the interpreter. Instead, we would like to use these
side-exits as additional profile information and re-compile the function.&lt;/p&gt;

&lt;p&gt;Right now we have a lot of traffic to the VM frame. JIT frame pushes are
reasonably fast, but with every effectful operation, we have to flush our local
variable state and stack state to the VM frame. The instances in which code
might want to read this reified frame state are rare: frame unwinding due to
exceptions, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Binding#local_variable_get&lt;/code&gt;, etc. In the future, we will instead
defer writing this state until it needs to be read.&lt;/p&gt;
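
&lt;p&gt;Here is a small sketch of one such rare reader, using the hypothetical helper
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;read_local&lt;/code&gt;. It reaches into another method’s local variables through a
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Binding&lt;/code&gt;, so whatever value &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;answer&lt;/code&gt; holds at that point has to be observable,
no matter where compiled code was keeping it.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: read a local variable out of a captured frame.
def read_local(frame_binding, name)
  frame_binding.local_variable_get(name)
end

def compute
  answer = 41
  answer += 1
  # The callee inspects this frame, so the up-to-date value must be visible.
  read_local(binding, :answer)
end

result = compute
p result # prints 42
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;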

&lt;p&gt;We only have a limited inliner that inlines constants, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;self&lt;/code&gt;, and parameters.
In the fullness of time, we will add a general-purpose method inlining
facility. This will allow us to reduce the number of polymorphic sends, do some
branch folding, and reduce the number of method sends.&lt;/p&gt;

&lt;p&gt;We only support optimizing positional parameters, required keyword parameters,
and optional parameters right now, but we will work on optimizing optional
keyword parameters as well. Most of this work is in marshaling the complex
Ruby calling convention into one coherent form that the JIT can understand.&lt;/p&gt;
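
&lt;p&gt;As a quick reference for the four parameter kinds named above, the sketch below
defines a hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;connect&lt;/code&gt; method that uses one of each and asks Ruby to
describe them with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Method#parameters&lt;/code&gt;. In that output, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:req&lt;/code&gt; is positional,
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:opt&lt;/code&gt; is optional, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:keyreq&lt;/code&gt; is a required keyword, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:key&lt;/code&gt; is an
optional keyword.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical method with one parameter of each kind.
def connect(host, port = 80, scheme:, timeout: 5)
  [host, port, scheme, timeout]
end

kinds = method(:connect).parameters
p kinds
# prints [[:req, :host], [:opt, :port], [:keyreq, :scheme], [:key, :timeout]]

# Defaults fill in whatever the caller leaves out.
p connect(:localhost, scheme: :http)
# prints [:localhost, 80, :http, 5]
p connect(:localhost, 8080, scheme: :https, timeout: 1)
# prints [:localhost, 8080, :https, 1]
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;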

&lt;h2 id=&quot;performance&quot;&gt;Performance&lt;/h2&gt;

&lt;p&gt;We have public performance numbers for a selection of macro- and
micro-benchmarks on &lt;a href=&quot;https://rubybench.github.io/&quot;&gt;rubybench&lt;/a&gt;. Here is a screenshot of what those
per-benchmark graphs look like. The Y axis is speedup multiplier vs the
interpreter and the X axis is time. Higher is better:&lt;/p&gt;

&lt;figure style=&quot;display: block; margin: 0 auto; max-width: 80%;&quot;&gt;
  &lt;img src=&quot;/assets/img/zjit-benchmark.png&quot; /&gt;
  &lt;figcaption&gt;A line chart of ZJIT performance on railsbench&amp;mdash;represented as a
  speedup multiplier when compared to the interpreter&amp;mdash;improving over
  time, passing interpreter performance, catching up to YJIT.&lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;You can see that we are improving performance on nearly all benchmarks over
time. Some of this comes from optimizing in a similar way as YJIT does
today (e.g. specializing ivar reads and writes), and some of it is optimizing
in a way that takes advantage of ZJIT’s high-level IR (e.g. constant folding,
branch folding, more precise type inference).&lt;/p&gt;

&lt;p&gt;We are using both raw time numbers and also our internal performance counters
(e.g. number of calls to C functions from generated code) to drive
optimization.&lt;/p&gt;

&lt;h2 id=&quot;try-it-out&quot;&gt;Try it out&lt;/h2&gt;

&lt;p&gt;While Ruby now ships with ZJIT compiled into the binary by default, it is not
&lt;em&gt;enabled&lt;/em&gt; by default at run-time. For performance and stability reasons, YJIT is
still the default compiler choice in Ruby 4.0.&lt;/p&gt;

&lt;p&gt;If you want to run your test suite with ZJIT to see what happens, you
absolutely can. Enable it by passing the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit&lt;/code&gt; flag, setting the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RUBY_ZJIT_ENABLE&lt;/code&gt; environment variable, or calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RubyVM::ZJIT.enable&lt;/code&gt; after
starting your application.&lt;/p&gt;
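
&lt;p&gt;For the run-time route, here is a minimal sketch. The helper name
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;enable_zjit_if_available&lt;/code&gt; is hypothetical, and the guards assume that
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;RubyVM::ZJIT&lt;/code&gt; is simply not defined on Rubies built without ZJIT. The point of the
guards is that the same file can be loaded on older Rubies without raising.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: enable ZJIT at run time when this Ruby provides it.
def enable_zjit_if_available
  # Assumption: the constant only exists on builds that include ZJIT.
  return :unavailable unless defined?(RubyVM::ZJIT)
  return :unavailable unless RubyVM::ZJIT.respond_to?(:enable)

  RubyVM::ZJIT.enable
  :enabled
end

status = enable_zjit_if_available
p status
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;To try it across a whole test run without touching any code, the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit&lt;/code&gt; flag
is probably the simpler choice.&lt;/p&gt;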

&lt;h2 id=&quot;on-yjit&quot;&gt;On YJIT&lt;/h2&gt;

&lt;p&gt;We devoted a lot of our resources this year to developing ZJIT. While we did
not spend much time on YJIT (outside of a great &lt;a href=&quot;https://railsatscale.com/2025-05-21-fast-allocations-in-ruby-3-5/&quot;&gt;allocation speed
up&lt;/a&gt;), YJIT isn’t going anywhere soon.&lt;/p&gt;

&lt;h2 id=&quot;thank-you&quot;&gt;Thank you&lt;/h2&gt;

&lt;p&gt;This compiler was made possible by contributions to your &lt;del&gt;PBS station&lt;/del&gt; open
source project from programmers like you. Thank you!&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Aaron Patterson&lt;/li&gt;
  &lt;li&gt;Abrar Habib&lt;/li&gt;
  &lt;li&gt;Aiden Fox Ivey&lt;/li&gt;
  &lt;li&gt;Alan Wu&lt;/li&gt;
  &lt;li&gt;Alex Rocha&lt;/li&gt;
  &lt;li&gt;André Luiz Tiago Soares&lt;/li&gt;
  &lt;li&gt;Benoit Daloze&lt;/li&gt;
  &lt;li&gt;Charlotte Wen&lt;/li&gt;
  &lt;li&gt;Daniel Colson&lt;/li&gt;
  &lt;li&gt;Donghee Na&lt;/li&gt;
  &lt;li&gt;Eileen Uchitelle&lt;/li&gt;
  &lt;li&gt;Étienne Barrié&lt;/li&gt;
  &lt;li&gt;Godfrey Chan&lt;/li&gt;
  &lt;li&gt;Goshanraj Govindaraj&lt;/li&gt;
  &lt;li&gt;Hiroshi SHIBATA&lt;/li&gt;
  &lt;li&gt;Hoa Nguyen&lt;/li&gt;
  &lt;li&gt;Jacob Denbeaux&lt;/li&gt;
  &lt;li&gt;Jean Boussier&lt;/li&gt;
  &lt;li&gt;Jeremy Evans&lt;/li&gt;
  &lt;li&gt;John Hawthorn&lt;/li&gt;
  &lt;li&gt;Ken Jin&lt;/li&gt;
  &lt;li&gt;Kevin Menard&lt;/li&gt;
  &lt;li&gt;Max Bernstein&lt;/li&gt;
  &lt;li&gt;Max Leopold&lt;/li&gt;
  &lt;li&gt;Maxime Chevalier-Boisvert&lt;/li&gt;
  &lt;li&gt;Nobuyoshi Nakada&lt;/li&gt;
  &lt;li&gt;Peter Zhu&lt;/li&gt;
  &lt;li&gt;Randy Stauner&lt;/li&gt;
  &lt;li&gt;Satoshi Tagomori&lt;/li&gt;
  &lt;li&gt;Shannon Skipper&lt;/li&gt;
  &lt;li&gt;Stan Lo&lt;/li&gt;
  &lt;li&gt;Takashi Kokubun&lt;/li&gt;
  &lt;li&gt;Tavian Barnes&lt;/li&gt;
  &lt;li&gt;Tobias Lütke&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;(via a lightly touched up &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;git log --pretty=&quot;%an&quot; zjit | sort -u&lt;/code&gt;)&lt;/p&gt;
</description>
            <pubDate>Wed, 24 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 24, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/launch-zjit/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/launch-zjit/</guid>
        </item>
        
        <item>
            <title>How to annotate JITed code for perf/samply</title>
            <description>&lt;p&gt;Brief one today. I got asked “does YJIT/ZJIT have support for [Linux] perf?”&lt;/p&gt;

&lt;p&gt;The answer is yes, and it also works with &lt;a href=&quot;https://github.com/mstange/samply&quot;&gt;samply&lt;/a&gt; (including on macOS!),
because both understand the &lt;a href=&quot;https://github.com/torvalds/linux/blob/516471569089749163be24b973ea928b56ac20d9/tools/perf/Documentation/jit-interface.txt&quot;&gt;perf map interface&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This is the entirety of the implementation in ZJIT&lt;sup id=&quot;fnref:hex&quot;&gt;&lt;a href=&quot;#fn:hex&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;register_with_perf&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iseq_name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start_ptr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;code_size&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;usize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;perf_map&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;format!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;/tmp/perf-{}.map&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;process&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;id&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;());&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;OpenOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.create&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.append&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;true&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.open&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;perf_map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;debug!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Failed to open perf map file: {perf_map}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;io&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;BufWriter&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;Ok&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;writeln!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;{start_ptr:x} {code_size:x} zjit::{iseq_name}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;debug!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Failed to write {iseq_name} to perf map file: {perf_map}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Whenever you generate a function, append a one-line entry consisting of&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;START SIZE symbolname
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;/tmp/perf-{PID}.map&lt;/code&gt;. Per the Linux docs linked above,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;START and SIZE are hex numbers without 0x.&lt;/p&gt;

  &lt;p&gt;symbolname is the rest of the line, so it could contain special characters.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You can now happily run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;perf record your_jit [...]&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;samply record your_jit
[...]&lt;/code&gt; and have JIT frames be named in the output. We hide this behind
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit-perf&lt;/code&gt; flag to avoid file I/O overhead when we don’t need it.&lt;/p&gt;
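
&lt;p&gt;The format is simple enough that the same thing fits in a few lines of any
language. Here is a minimal Python sketch of the idea, not ZJIT’s code:
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;perf_map_entry&lt;/code&gt; and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;directory&lt;/code&gt; parameter are names made up for
illustration.&lt;/p&gt;

```python
import os


def perf_map_entry(start: int, size: int, name: str) -> str:
    """Format one perf map line: hex START and SIZE without 0x, then the name."""
    return f"{start:x} {size:x} {name}\n"


def register_with_perf(name: str, start: int, size: int, directory: str = "/tmp") -> str:
    """Append one entry to perf-PID.map in `directory` and return the file path.

    The `directory` parameter is hypothetical; perf itself looks in /tmp.
    """
    path = os.path.join(directory, f"perf-{os.getpid()}.map")
    with open(path, "a") as f:  # append mode: one line per generated function
        f.write(perf_map_entry(start, size, name))
    return path
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;register_with_perf(&quot;zjit::foo&quot;, 0x7f00, 0x40)&lt;/code&gt; would append the line
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;7f00 40 zjit::foo&lt;/code&gt; to the map file for the current process.&lt;/p&gt;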

&lt;h2 id=&quot;there-is-also-the-jit-dump-interface&quot;&gt;There is also the JIT dump interface&lt;/h2&gt;

&lt;p&gt;Perf map is the older way to interact with perf: a newer, more complicated way
involves &lt;a href=&quot;https://theunixzoo.co.uk/blog/2025-09-14-linux-perf-jit.html&quot;&gt;generating a “dump” file&lt;/a&gt; and then &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;perf inject&lt;/code&gt;ing it.&lt;/p&gt;

&lt;!--

## There is also the JIT gdb interface

This is not strictly related but I want to figure it out

--&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:hex&quot;&gt;
      &lt;p&gt;We actually use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:#x}&lt;/code&gt;, which I noticed today is wrong. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:#x}&lt;/code&gt; leaves
in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;0x&lt;/code&gt;, and it shouldn’t; instead &lt;strong&gt;use &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{:x}&lt;/code&gt;&lt;/strong&gt;. &lt;a href=&quot;#fnref:hex&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Thu, 18 Dec 2025 00:00:00 +0000</pubDate>
            <niceDate>December 18, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/jit-perf-map/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/jit-perf-map/</guid>
        </item>
        
        <item>
            <title>A catalog of side effects</title>
            <description>&lt;p&gt;Optimizing compilers like to keep track of each IR instruction’s &lt;em&gt;effects&lt;/em&gt;. An
instruction’s effects vary wildly from having no effects at all, to writing a
specific variable, to completely unknown (writing all state).&lt;/p&gt;

&lt;p&gt;This post can be thought of as a continuation of &lt;a href=&quot;/blog/irs/&quot;&gt;What I talk about when I talk
about IRs&lt;/a&gt;, specifically the section talking about asking the right
questions. When we talk about effects, we should ask the right questions: not
&lt;em&gt;what opcode is this?&lt;/em&gt; but instead &lt;em&gt;what effects does this opcode have?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Different compilers represent and track these effects differently. I’ve been
thinking about how to represent these effects all year, so I have been doing
some reading. In this post I will give some summaries of the landscape of
approaches. Please feel free to suggest more.&lt;/p&gt;

&lt;h2 id=&quot;some-background&quot;&gt;Some background&lt;/h2&gt;

&lt;p&gt;Internal IR effect tracking is similar to the programming language notion of
algebraic effects in type systems, but internally, compilers keep track of
finer-grained effects. Effects such as “writes to a local variable”, “writes to
a list”, or “reads from the stack” indicate what instructions can be
re-ordered, duplicated, or removed entirely.&lt;/p&gt;

&lt;p&gt;For example, consider the following pseudocode for some made-up language that
stands in for a snippet of compiler IR:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;another_var&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The goal of effects is to communicate to the compiler if, for example, these two IR
instructions can be re-ordered. The second instruction &lt;em&gt;might&lt;/em&gt; write to a
location that the first one reads. But it also might not! This is about knowing
if &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;some_var&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;another_var&lt;/code&gt; &lt;em&gt;alias&lt;/em&gt;—if they are different names that
refer to the same object.&lt;/p&gt;

&lt;p&gt;We can sometimes answer that question directly, but often it’s cheaper to
compute an approximate answer: &lt;em&gt;could&lt;/em&gt; they even alias? It’s possible that
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;some_var&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;another_var&lt;/code&gt; have different types, meaning that (as long as you
have strict aliasing) the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Load&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Store&lt;/code&gt; operations that implement these
reads and writes by definition touch different locations. And if they look
at disjoint locations, there need not be any explicit order enforced.&lt;/p&gt;

&lt;p&gt;Different compilers keep track of this information differently. The null effect
analysis gives up and says “every instruction is maximally effectful” and
therefore “we can’t re-order or delete any instructions”. That’s probably fine
for a first stab at a compiler, where you will get a big speed up purely based
on strength reductions. Over-approximations of effects should always be
valid.&lt;/p&gt;

&lt;p&gt;But at some point you start wanting to do dead code elimination (DCE), or
common subexpression elimination (CSE), or load/store elimination, or move
instructions around, and you start wondering how to represent effects. That’s
where I am right now. So here’s a catalog of different compilers I have looked
at recently.&lt;/p&gt;

&lt;p&gt;There are two main ways I have seen to represent effects: bitsets and heap
range lists. We’ll look at one example compiler for each, talk a bit about
tradeoffs, then give a bunch of references to other major compilers.&lt;/p&gt;

&lt;p&gt;We’ll start with &lt;a href=&quot;https://github.com/facebookincubator/cinder&quot;&gt;Cinder&lt;/a&gt;, a Python JIT, because that’s what I used to
work on.&lt;/p&gt;

&lt;h2 id=&quot;cinder&quot;&gt;Cinder&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/facebookincubator/cinder&quot;&gt;Cinder&lt;/a&gt; tracks heap effects for its high-level IR (HIR) in
&lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/8bf5af94e2792d3fd386ab25b1aeedae27276d50/cinderx/Jit/hir/instr_effects.h&quot;&gt;instr_effects.h&lt;/a&gt;. Pretty much everything happens in
the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoryEffects(const Instr&amp;amp; instr)&lt;/code&gt; function, which is expected to know
everything about what effects the given instruction might have.&lt;/p&gt;

&lt;p&gt;The data representation is a bitset encoding of a lattice called an
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AliasClass&lt;/code&gt;, which is defined in &lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/8bf5af94e2792d3fd386ab25b1aeedae27276d50/cinderx/Jit/hir/alias_class.h&quot;&gt;alias_class.h&lt;/a&gt;. Each
bit in the bitset represents a distinct location in the heap: reads from and
writes to each of these locations are guaranteed not to affect any of the other
locations.&lt;/p&gt;

&lt;p&gt;Here is the X-macro that defines it:&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#define HIR_BASIC_ACLS(X) \
  X(ArrayItem)            \
  X(CellItem)             \
  X(DictItem)             \
  X(FuncArgs)             \
  X(FuncAttr)             \
  X(Global)               \
  X(InObjectAttr)         \
  X(ListItem)             \
  X(Other)                \
  X(TupleItem)            \
  X(TypeAttrCache)        \
  X(TypeMethodCache)
&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;enum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BitIndexes&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#define ACLS(name) k##name##Bit,
&lt;/span&gt;    &lt;span class=&quot;n&quot;&gt;HIR_BASIC_ACLS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ACLS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#undef ACLS
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that each bit implicitly represents a set: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ListItem&lt;/code&gt; does not refer to a
&lt;em&gt;specific&lt;/em&gt; list index, but the infinite set of all possible list indices. It’s
&lt;em&gt;any&lt;/em&gt; list index. Still, every list index is completely disjoint from, say, every
entry in a global variable table.&lt;/p&gt;

&lt;p&gt;(And, to be clear, an object in a list might be the same as an object in a
global variable table. The objects themselves can alias. But the thing being
written to or read from, the thing &lt;em&gt;being side effected&lt;/em&gt;, is the container.)&lt;/p&gt;

&lt;p&gt;Like other bitset lattices, it’s possible to union the sets by or-ing the bits.
It’s possible to query for overlap by and-ing the bits.&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;c1&quot;&gt;// The union of two AliasClass&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;c1&quot;&gt;// The intersection (overlap) of two AliasClass&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AliasClass&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;bits_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;If this sounds familiar, it’s because (as the repo notes) it’s a similar idea
to Cinder’s &lt;a href=&quot;/blog/lattice-bitset/&quot;&gt;type lattice representation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Like other lattices, there is both a bottom element (no effects) and a top
element (all possible effects):&lt;/p&gt;

&lt;div class=&quot;language-c highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;cp&quot;&gt;#define HIR_OR_BITS(name) | k##name
&lt;/span&gt;
&lt;span class=&quot;cp&quot;&gt;#define HIR_UNION_ACLS(X)                           \
  &lt;/span&gt;&lt;span class=&quot;cm&quot;&gt;/* Bottom union */&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;                                \
  X(Empty, 0)                                       \
  &lt;/span&gt;&lt;span class=&quot;cm&quot;&gt;/* Top union */&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt;                                   \
  X(Any, 0 HIR_BASIC_ACLS(HIR_OR_BITS))             \
  &lt;/span&gt;&lt;span class=&quot;cm&quot;&gt;/* Memory locations accessible by managed code */&lt;/span&gt;&lt;span class=&quot;cp&quot;&gt; \
  X(ManagedHeapAny, kAny &amp;amp; ~kFuncArgs)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Union operations naturally hit a fixpoint at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Any&lt;/code&gt; and intersection operations
naturally hit a fixpoint at &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Empty&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;All of this together lets the optimizer ask and answer questions such as:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;where might this instruction write?&lt;/li&gt;
  &lt;li&gt;(because CPython is reference counted and incref implies ownership) where
does this instruction borrow its input from?&lt;/li&gt;
  &lt;li&gt;do these two instructions’ write destinations overlap?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and more.&lt;/p&gt;

&lt;p&gt;Let’s take a look at an (imaginary) IR version of the code snippet in the intro
and see what analyzing it might look like in the optimizer. Here is the fake
IR:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0: Tuple = ...
v1: List = ...
v2: Int[5] = ...
# v = some_var[0]
v3: Object = LoadTupleItem v0, 0
# another_var[0] = 5
StoreListItem v1, 0, v2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can imagine that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LoadTupleItem&lt;/code&gt; declares that it reads from the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TupleItem&lt;/code&gt; heap and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StoreListItem&lt;/code&gt; declares that it writes to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ListItem&lt;/code&gt;
heap. Because tuple and list pointers cannot be cast into one another and
therefore cannot alias, these are disjoint heaps in our bitset. That means
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ListItem &amp;amp; TupleItem == 0&lt;/code&gt;, so these memory operations can never
interfere! They can (for example) be re-ordered arbitrarily.&lt;/p&gt;
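
&lt;p&gt;To make the disjointness check concrete, here is a small Python sketch. It
models alias classes as sets of heap names rather than bits to stay readable;
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Effects&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;interferes&lt;/code&gt; are hypothetical names, not Cinder’s API.&lt;/p&gt;

```python
from dataclasses import dataclass

EMPTY = frozenset()


@dataclass(frozen=True)
class Effects:
    """Hypothetical per-instruction summary: heaps it may read and may write."""
    may_load: frozenset = EMPTY
    may_store: frozenset = EMPTY


def interferes(a: Effects, b: Effects) -> bool:
    """True if one instruction may write a heap the other may read or write."""
    return not (
        a.may_store.isdisjoint(b.may_load | b.may_store)
        and b.may_store.isdisjoint(a.may_load)
    )


# The two instructions from the fake IR above.
load_tuple_item = Effects(may_load=frozenset({"TupleItem"}))
store_list_item = Effects(may_store=frozenset({"ListItem"}))
# A made-up third instruction that reads the heap the store writes.
load_list_item = Effects(may_load=frozenset({"ListItem"}))
```

&lt;p&gt;With these definitions, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;interferes(load_tuple_item, store_list_item)&lt;/code&gt; is false, so
the pair can be swapped, while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;interferes(load_list_item, store_list_item)&lt;/code&gt; is
true and the order must be kept.&lt;/p&gt;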

&lt;p&gt;In Cinder, these memory effects could in the future be used for instruction
re-ordering, but they are today mostly used in two places: the refcount
insertion pass and DCE.&lt;/p&gt;

&lt;p&gt;DCE involves first finding the set of instructions that need to be kept around
because they are useful/important/have effects. So here is what the Cinder DCE
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isUseful&lt;/code&gt; looks like:&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;isUseful&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Instr&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsTerminator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsSnapshot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;asDeoptBase&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;nullptr&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsPrimitiveBox&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt;
      &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;IsPhi&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;memoryEffects&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;instr&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;may_store&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AEmpty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There are some other checks in there but &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;memoryEffects&lt;/code&gt; is right there at the
core of it!&lt;/p&gt;
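
&lt;p&gt;Once the useful instructions are known, the rest of DCE is a standard
mark-and-sweep over operands. Here is a minimal Python sketch of that shape,
assuming a made-up &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Instr&lt;/code&gt; class and a much simpler &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;is_useful&lt;/code&gt; than the real
one; it is not Cinder’s implementation.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity hashing so instructions fit in a set
class Instr:
    """Hypothetical IR instruction; not Cinder's real class."""
    name: str
    operands: list = field(default_factory=list)
    may_store: frozenset = frozenset()
    is_terminator: bool = False


def is_useful(instr: Instr) -> bool:
    # Roots: control flow and anything that may write memory.
    return instr.is_terminator or bool(instr.may_store)


def eliminate_dead_code(instrs: list) -> list:
    """Keep instructions that are useful or that feed a useful instruction."""
    worklist = [i for i in instrs if is_useful(i)]
    live = set(worklist)
    while worklist:
        instr = worklist.pop()
        for operand in instr.operands:
            if operand not in live:
                live.add(operand)
                worklist.append(operand)
    # Sweep: preserve the original order, drop everything unmarked.
    return [i for i in instrs if i in live]
```

&lt;p&gt;A load whose result is never used has an empty &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;may_store&lt;/code&gt; and no live users,
so it is swept; a store is a root and keeps its operands alive.&lt;/p&gt;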

&lt;p&gt;Now that we have seen the bitset representation of effects and an
implementation in Cinder, let’s take a look at a different representation and
an implementation in JavaScriptCore.&lt;/p&gt;

&lt;h2 id=&quot;javascriptcore&quot;&gt;JavaScriptCore&lt;/h2&gt;

&lt;p&gt;I keep coming back to &lt;a href=&quot;https://gist.github.com/pizlonator/cf1e72b8600b1437dda8153ea3fdb963&quot;&gt;How I implement SSA form&lt;/a&gt; by &lt;a href=&quot;http://www.filpizlo.com/&quot;&gt;Fil
Pizlo&lt;/a&gt;, one of the significant contributors to JavaScriptCore (JSC). In
particular, I keep coming back to the &lt;a href=&quot;https://gist.github.com/pizlonator/cf1e72b8600b1437dda8153ea3fdb963#uniform-effect-representation&quot;&gt;Uniform Effect
Representation&lt;/a&gt; section. This notion of “abstract heaps” felt
very… well, abstract. Somehow more abstract than the bitset representation.
The pre-order and post-order integer pair as a way to represent nested heap
effects just did not click.&lt;/p&gt;

&lt;p&gt;It didn’t make any sense until I actually went spelunking in JavaScriptCore and
found one of several implementations—because, you know, JSC is six compilers
in a trenchcoat&lt;sup&gt;[&lt;a href=&quot;https://en.wikipedia.org/wiki/Wikipedia:Citation_needed&quot;&gt;&lt;i&gt;citation needed&lt;/i&gt;&lt;/a&gt;]&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;DFG, B3, DOMJIT, and probably others all have their own abstract heap
implementations. We’ll look at DOMJIT mostly because it’s a smaller example and
also illustrates something else that’s interesting: builtins. We’ll come back
to builtins in a minute.&lt;/p&gt;

&lt;p&gt;Let’s take a look at how DOMJIT structures its &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/WebCore/domjit/DOMJITAbstractHeapRepository.yaml&quot;&gt;abstract
heaps&lt;/a&gt;: a YAML file.&lt;/p&gt;

&lt;div class=&quot;language-yaml highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;na&quot;&gt;DOM&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;na&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_firstChild&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_lastChild&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_parentNode&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_nextSibling&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_previousSibling&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Node_ownerDocument&lt;/span&gt;
        &lt;span class=&quot;na&quot;&gt;Document&lt;/span&gt;&lt;span class=&quot;pi&quot;&gt;:&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Document_documentElement&lt;/span&gt;
            &lt;span class=&quot;pi&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;Document_body&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It’s a hierarchy. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_firstChild&lt;/code&gt; is a subheap of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node&lt;/code&gt; is a subheap of…
and so on. A write to any &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_nextSibling&lt;/code&gt; is a write to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node&lt;/code&gt; is a write to
… Sibling heaps are unrelated: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_firstChild&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node_lastChild&lt;/code&gt;, for
example, are disjoint.&lt;/p&gt;
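
&lt;p&gt;One way to encode such a hierarchy, in the spirit of the integer pairs
mentioned above, is to give each heap a half-open range that contains the
ranges of its children. The Python sketch below is only an illustration under
that assumption: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assign_ranges&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;overlaps&lt;/code&gt; are hypothetical helpers, and the
numbering is not claimed to match what JSC generates.&lt;/p&gt;

```python
# The YAML hierarchy above, as nested dicts (inner heaps) and lists (leaf heaps).
DOM_HEAPS = {
    "DOM": {
        "Tree": {
            "Node": [
                "Node_firstChild", "Node_lastChild", "Node_parentNode",
                "Node_nextSibling", "Node_previousSibling", "Node_ownerDocument",
            ],
            "Document": ["Document_documentElement", "Document_body"],
        }
    }
}


def assign_ranges(tree: dict) -> dict:
    """Map each heap name to a half-open (begin, end) range via a depth-first walk."""
    ranges = {}
    next_slot = 0

    def visit(name, children):
        nonlocal next_slot
        begin = next_slot
        if not children:
            next_slot += 1  # a leaf heap owns exactly one slot
        elif isinstance(children, dict):
            for child, grandchildren in children.items():
                visit(child, grandchildren)
        else:
            for child in children:
                visit(child, None)
        ranges[name] = (begin, next_slot)  # a parent spans all of its children

    for name, children in tree.items():
        visit(name, children)
    return ranges


def overlaps(a: tuple, b: tuple) -> bool:
    """Two half-open ranges overlap if each one ends after the other begins."""
    return a[1] > b[0] and b[1] > a[0]
```

&lt;p&gt;In this numbering, a parent overlaps every one of its descendants, and
siblings never overlap each other, which is exactly the subheap relationship
described above.&lt;/p&gt;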

&lt;p&gt;To get a feel for this, I wired up a &lt;a href=&quot;https://github.com/tekknolagi/tekknolagi.github.com/tree/main/assets/code/gen_bitset.rb&quot;&gt;simplified version&lt;/a&gt; of
ZJIT’s bitset generator (for &lt;em&gt;types!&lt;/em&gt;) to read a YAML document and generate a
bitset. It generated the following Rust code:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;mod&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Empty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0u64&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_documentElement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_body&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_documentElement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_firstChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_lastChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_nextSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_ownerDocument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_parentNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_previousSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_firstChild&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_lastChild&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_nextSibling&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_ownerDocument&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_parentNode&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_previousSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NumTypeBits&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;u64&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It’s not a fancy X-macro, but it’s a short and flexible Ruby script.&lt;/p&gt;

&lt;p&gt;Then I took the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/WebCore/domjit/generate-abstract-heap.rb&quot;&gt;DOMJIT abstract heap
generator&lt;/a&gt;—also funnily enough a short Ruby
script—modified the output format slightly, and had it generate its int
pairs:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;mod&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bits&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
  &lt;span class=&quot;cm&quot;&gt;/* DOMJIT Abstract Heap Tree.
  DOM&amp;lt;0,8&amp;gt;:
      Tree&amp;lt;0,8&amp;gt;:
          Node&amp;lt;0,6&amp;gt;:
              Node_firstChild&amp;lt;0,1&amp;gt;
              Node_lastChild&amp;lt;1,2&amp;gt;
              Node_parentNode&amp;lt;2,3&amp;gt;
              Node_nextSibling&amp;lt;3,4&amp;gt;
              Node_previousSibling&amp;lt;4,5&amp;gt;
              Node_ownerDocument&amp;lt;5,6&amp;gt;
          Document&amp;lt;6,8&amp;gt;:
              Document_documentElement&amp;lt;6,7&amp;gt;
              Document_body&amp;lt;7,8&amp;gt;
  */&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tree&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_firstChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_lastChild&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_parentNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_nextSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_previousSibling&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node_ownerDocument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_documentElement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;pub&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Document_body&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It already comes with a little diagram, which is super helpful for readability.&lt;/p&gt;

&lt;p&gt;An empty range represents empty heap effects: if the start and end are the
same number, there are no effects. There is no single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Empty&lt;/code&gt; value, but any empty
range could be normalized to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange { start: 0, end: 0 }&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Maybe this was obvious to you, dear reader, but this pre-order/post-order thing
is about nested ranges! Seeing the output of the generator laid out clearly
like this made it make a lot more sense for me.&lt;/p&gt;
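
&lt;p&gt;To make the nesting concrete, here is a minimal sketch (mine, not DOMJIT’s) that
checks containment on a few of the generated ranges. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt; struct
definition and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cover&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;contains&lt;/code&gt; helpers are assumptions for
illustration; the generated module only gives the constants. It treats ranges
as plain intervals and does not special-case empty ranges.&lt;/p&gt;

```rust
// Hypothetical definition; the generated constants only imply its shape.
#[derive(Clone, Copy, Debug, PartialEq)]
struct HeapRange {
    start: u32, // inclusive
    end: u32,   // exclusive
}

// A few of the generated ranges, renamed to Rust's constant style.
const TREE: HeapRange = HeapRange { start: 0, end: 8 };
const NODE: HeapRange = HeapRange { start: 0, end: 6 };
const NODE_FIRST_CHILD: HeapRange = HeapRange { start: 0, end: 1 };
const DOCUMENT_BODY: HeapRange = HeapRange { start: 7, end: 8 };

impl HeapRange {
    // Smallest single range that covers both `self` and `other`.
    fn cover(self, other: HeapRange) -> HeapRange {
        HeapRange {
            start: self.start.min(other.start),
            end: self.end.max(other.end),
        }
    }

    // `inner` is nested in `self` exactly when covering both changes nothing.
    fn contains(self, inner: HeapRange) -> bool {
        self.cover(inner) == self
    }
}
```

&lt;p&gt;With these definitions, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;TREE.contains(NODE)&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NODE.contains(NODE_FIRST_CHILD)&lt;/code&gt; hold, while
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NODE.contains(DOCUMENT_BODY)&lt;/code&gt; does not.&lt;/p&gt;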

&lt;!--
So how do we compute subtyping relationships with `HeapRange`s? We check range
overlap! Here is [DOMJIT&apos;s C++ implementation][domjit-is-subtype-of]:

[domjit-is-subtype-of]: https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/JavaScriptCore/domjit/DOMJITHeapRange.h#L99

```c++
class HeapRange {
    constexpr explicit operator bool() const {
        return m_begin != m_end;
    }

    bool isStrictSubtypeOf(const HeapRange&amp; other) const {
        if (!*this || !other)
            return false;
        if (*this == other)
            return false;
        return other.m_begin &lt;= m_begin &amp;&amp; m_end &lt;= other.m_end;
    }

    bool isSubtypeOf(const HeapRange&amp; other) const {
        if (!*this || !other)
            return false;
        if (*this == other)
            return true;
        return isStrictSubtypeOf(other);
    }
```

This is represented by the `operator bool()`
and implicit boolean conversions. To reinforce the whole nested heap ranges
thing, `isSubtypeOf` is asking if one `HeapRange` contains another.
--&gt;

&lt;p&gt;What about checking overlap? Here is the &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/989c9f9cd5b1f0c9606820e219ee51da32a34c6b/Source/JavaScriptCore/domjit/DOMJITHeapRange.h#L108&quot;&gt;implementation in
JSC&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;namespace&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WTF&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;c1&quot;&gt;// Check if two ranges overlap assuming that neither range is empty.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;constexpr&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nonEmptyRangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c1&quot;&gt;// Pass ranges with the min being inclusive and the max being exclusive.&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;constexpr&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;rangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ASSERT_UNDER_CONSTEXPR_CONTEXT&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;// Empty ranges interfere with nothing.&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nonEmptyRangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;leftMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;leftMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rightMax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;HeapRange&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;overlaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WTF&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rangesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_begin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_begin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(See also &lt;a href=&quot;https://zayenz.se/blog/post/how-to-check-for-overlapping-intervals/&quot;&gt;How to check for overlapping intervals&lt;/a&gt; and
&lt;a href=&quot;https://nedbatchelder.com/blog/201310/range_overlap_in_two_compares.html&quot;&gt;Range overlap in two compares&lt;/a&gt; for more fun.)&lt;/p&gt;
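
&lt;p&gt;For comparison, the same check can be sketched in Rust against the generated
constants. This is my own port, not part of ZJIT or JSC: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt;
struct and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;overlaps&lt;/code&gt; method are assumptions, and it folds the empty-range
case into a single comparison instead of checking for it up front.&lt;/p&gt;

```rust
// Hypothetical HeapRange; start is inclusive and end is exclusive.
#[derive(Clone, Copy, Debug, PartialEq)]
struct HeapRange {
    start: u32,
    end: u32,
}

// A few of the generated ranges, renamed to Rust's constant style.
const NODE: HeapRange = HeapRange { start: 0, end: 6 };
const NODE_FIRST_CHILD: HeapRange = HeapRange { start: 0, end: 1 };
const NODE_LAST_CHILD: HeapRange = HeapRange { start: 1, end: 2 };
const EMPTY: HeapRange = HeapRange { start: 0, end: 0 };

impl HeapRange {
    // Two half-open ranges overlap when the later start comes before the
    // earlier end. An empty range has start == end, so it never passes.
    fn overlaps(self, other: HeapRange) -> bool {
        let later_start = self.start.max(other.start);
        let earlier_end = self.end.min(other.end);
        earlier_end > later_start
    }
}
```

&lt;p&gt;Here &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NODE&lt;/code&gt; overlaps &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NODE_FIRST_CHILD&lt;/code&gt;, the two sibling leaves do not
overlap each other, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;EMPTY&lt;/code&gt; overlaps nothing.&lt;/p&gt;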

&lt;p&gt;While bitsets are a dense representation (you have to hold every bit), they are
very compact and very precise. You can hold any combination of 64 or 128 bits
in a single register. The union and intersection operations
are very cheap.&lt;/p&gt;

&lt;p&gt;With int ranges, it’s a little more complicated. An imprecise union of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; can take the maximal range that covers both &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt;. To get a more
precise union, you have to keep track of both. In the worst case, if you want
efficient arbitrary queries, you need to store your int ranges in an interval
tree. So what gives?&lt;/p&gt;
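
&lt;p&gt;To see the imprecision concretely, here is a small sketch. It assumes a
hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt; type with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cover&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;overlaps&lt;/code&gt; helpers; none of
these names come from JSC or ZJIT. The covering range of two distant leaves
swallows everything in between, while keeping both ranges stays precise at the
cost of holding a list.&lt;/p&gt;

```rust
// Hypothetical HeapRange; start is inclusive and end is exclusive.
#[derive(Clone, Copy, Debug, PartialEq)]
struct HeapRange {
    start: u32,
    end: u32,
}

const NODE_FIRST_CHILD: HeapRange = HeapRange { start: 0, end: 1 };
const NODE_PARENT_NODE: HeapRange = HeapRange { start: 2, end: 3 };
const DOCUMENT_BODY: HeapRange = HeapRange { start: 7, end: 8 };

impl HeapRange {
    fn overlaps(self, other: HeapRange) -> bool {
        self.end.min(other.end) > self.start.max(other.start)
    }

    // Imprecise union: one range that covers both inputs and the gap between.
    fn cover(self, other: HeapRange) -> HeapRange {
        HeapRange {
            start: self.start.min(other.start),
            end: self.end.max(other.end),
        }
    }
}

// Imprecise: the cover of the two writes is the whole 0..8 tree.
fn imprecise_conflict(query: HeapRange) -> bool {
    NODE_FIRST_CHILD.cover(DOCUMENT_BODY).overlaps(query)
}

// Precise: keep both ranges and test each one.
fn precise_conflict(query: HeapRange) -> bool {
    [NODE_FIRST_CHILD, DOCUMENT_BODY]
        .iter()
        .any(|written| written.overlaps(query))
}
```

&lt;p&gt;A read of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NODE_PARENT_NODE&lt;/code&gt; conflicts with the covering range but with neither
of the two individual writes.&lt;/p&gt;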

&lt;p&gt;I asked Fil: if both bitsets and int ranges answer the same question, why use
int ranges? He said that it’s more flexible long-term: bitsets get expensive as
soon as you need over 128 bits (you might need to heap allocate them!) whereas
ranges have no such ceiling. But doesn’t holding sequences of ranges require
heap allocation? Well, Fil does write this in his SSA post:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The purpose of the effect representation baked into the IR is to provide a
precise always-available baseline for alias information that is super easy to
work with. […] you can have instructions report that they read/write
multiple heaps […] you can have a utility function that produces such lists
on demand.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It’s important to note that this doesn’t actually involve any allocation of
lists. JSC does this very clever thing where they have “functors” that they
pass in as arguments that compress/summarize what they want out of an
instruction’s effects.&lt;/p&gt;
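
&lt;p&gt;Roughly, the pattern looks like the following Rust sketch. Everything here is
hypothetical: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insn&lt;/code&gt; enum, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;for_each_write&lt;/code&gt;, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;writes_overlap&lt;/code&gt; are
made-up names that only illustrate the shape of the idea, not JSC’s actual API.
The instruction reports its heaps one at a time to a callback, and the callback
keeps only the summary it cares about.&lt;/p&gt;

```rust
// Hypothetical HeapRange; start is inclusive and end is exclusive.
#[derive(Clone, Copy, Debug, PartialEq)]
struct HeapRange {
    start: u32,
    end: u32,
}

const NODE_FIRST_CHILD: HeapRange = HeapRange { start: 0, end: 1 };
const NODE_LAST_CHILD: HeapRange = HeapRange { start: 1, end: 2 };
const NODE_PARENT_NODE: HeapRange = HeapRange { start: 2, end: 3 };

impl HeapRange {
    fn overlaps(self, other: HeapRange) -> bool {
        self.end.min(other.end) > self.start.max(other.start)
    }
}

// Made-up instructions for illustration.
#[derive(Clone, Copy)]
enum Insn {
    // Writes a single heap.
    SetParent,
    // Writes two heaps, e.g. appending a child to an empty node.
    AppendChild,
}

// Report each heap the instruction writes to `visit`, one at a time.
// No list of heaps is ever built.
fn for_each_write(insn: Insn, mut visit: impl FnMut(HeapRange)) {
    match insn {
        Insn::SetParent => visit(NODE_PARENT_NODE),
        Insn::AppendChild => {
            visit(NODE_FIRST_CHILD);
            visit(NODE_LAST_CHILD);
        }
    }
}

// Summarize the writes down to a single bool.
fn writes_overlap(insn: Insn, heap: HeapRange) -> bool {
    let mut result = false;
    for_each_write(insn, |written| {
        if written.overlaps(heap) {
            result = true;
        }
    });
    result
}
```

&lt;p&gt;The closure plays the role of the functor: it sees every written heap but only
carries a single &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;bool&lt;/code&gt; out of the walk.&lt;/p&gt;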

&lt;p&gt;Let’s take a look at how the DFG (for example) uses these heap ranges in
analysis. The DFG is structured in such a way that it can make use of the
DOMJIT heap ranges directly, which is neat.&lt;/p&gt;

&lt;p&gt;Note that &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AbstractHeap&lt;/code&gt; in the example below is a thin wrapper over the DFG
compiler’s own &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;DOMJIT::HeapRange&lt;/code&gt; equivalent:&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;AbstractHeapOverlaps&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
&lt;span class=&quot;nl&quot;&gt;public:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;AbstractHeapOverlaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;false&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;operator&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;otherHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;overlaps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;otherHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

    &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;private&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;mutable&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;m_result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;

&lt;span class=&quot;kt&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;writesOverlap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Graph&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;NoOpClobberize&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;noOp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;AbstractHeapOverlaps&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;clobberize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;noOp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;noOp&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;addWrite&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;clobberize&lt;/code&gt; is the function that calls these functors (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;noOp&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;addWrite&lt;/code&gt; in
this case) for each effect that the given IR instruction &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;node&lt;/code&gt; declares.&lt;/p&gt;
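
&lt;p&gt;To make the functor idea concrete outside of JSC, here is a minimal, self-contained sketch of the same pattern. Everything in it is made up for illustration and is not the DFG API: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt; is assumed to be a non-empty half-open range, and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Op&lt;/code&gt; enum and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;forEachEffect&lt;/code&gt; are hypothetical. The point is only that the caller gets a one-bit answer without a list of effects ever being built.&lt;/p&gt;

```cpp
#include <cstdint>

// Hypothetical half-open range of abstract heap indices, [begin, end).
// Assumed to be non-empty in this sketch.
struct HeapRange {
    uint16_t begin;
    uint16_t end;

    bool overlaps(HeapRange other) const
    {
        return begin < other.end && other.begin < end;
    }
};

// Stand-in for "all of memory".
constexpr HeapRange kTop { 0, UINT16_MAX };

// Hypothetical opcodes for a toy IR; these are not DFG node types.
enum class Op { Constant, LoadField, StoreField, Call };

struct Instr {
    Op op;
    HeapRange heap; // the heap a load or store touches; unused otherwise
};

// Reports each effect of `instr` by calling a functor. No list is allocated.
template<typename ReadFunctor, typename WriteFunctor>
void forEachEffect(const Instr& instr, const ReadFunctor& read, const WriteFunctor& write)
{
    switch (instr.op) {
    case Op::Constant:
        return; // no effects at all
    case Op::LoadField:
        read(instr.heap);
        return;
    case Op::StoreField:
        write(instr.heap);
        return;
    case Op::Call:
        // An unknown call may read and write anything.
        read(kTop);
        write(kTop);
        return;
    }
}

// Summarizes the writes of `instr` down to a single bool.
bool writesOverlap(const Instr& instr, HeapRange heap)
{
    bool result = false;
    auto ignore = [](HeapRange) { };
    auto checkWrite = [&](HeapRange written) { result = result || heap.overlaps(written); };
    forEachEffect(instr, ignore, checkWrite);
    return result;
}
```

&lt;p&gt;A different analysis can reuse the same walk over effects by passing different functors, for example one that counts reads or one that accumulates the union of everything written.&lt;/p&gt;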

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;clobberize&lt;/code&gt; is quite long, so I’ve pulled out a few snippets that I
think are interesting.&lt;/p&gt;

&lt;p&gt;First, some instructions (constants, here) have no effects. There’s some
utility in the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;def(PureValue(...))&lt;/code&gt; call but I didn’t fully understand it.&lt;/p&gt;

&lt;p&gt;Then there are some instructions that conditionally have effects depending on
the use types of their operands.&lt;sup id=&quot;fnref:dfg-use-type&quot;&gt;&lt;a href=&quot;#fn:dfg-use-type&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; Taking the absolute value of an
Int32 or a Double is effect-free but otherwise looks like it can run arbitrary
code.&lt;/p&gt;

&lt;p&gt;Some run-time IR guards that might cause side exits are annotated as
such—they write to the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SideState&lt;/code&gt; heap.&lt;/p&gt;

&lt;p&gt;Local variable instructions read &lt;em&gt;specific&lt;/em&gt; heaps indexed by what looks like
the local index but I’m not sure. This means accessing two different locals
won’t alias!&lt;/p&gt;

&lt;p&gt;Instructions that allocate can’t be re-ordered, it looks like; they both read
and write the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapObjectCount&lt;/code&gt;. This probably limits the amount of allocation
sinking that can be done.&lt;/p&gt;

&lt;p&gt;Then there’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CallDOM&lt;/code&gt;, which is the builtins stuff I was talking about. We’ll
come back to that after the code block.&lt;/p&gt;

&lt;div class=&quot;language-c++ highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;template&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ReadFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;WriteFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;DefFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;typename&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;ClobberTopFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;&amp;gt;&lt;/span&gt;
&lt;span class=&quot;kt&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;clobberize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Graph&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ReadFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;WriteFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DefFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ClobberTopFunctor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clobberTopFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;switch&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;JSConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DoubleConstant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Int52Constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PureValue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;constant&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ArithAbs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;child1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;useKind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Int32Use&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;child1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;().&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;useKind&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DoubleRepUse&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PureValue&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;arithMode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;clobberTop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AssertInBounds&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AssertNotEmpty&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SideState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GetLocal&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Stack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapLocation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;StackLoc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Stack&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;LazyNode&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;));&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewArrayWithSize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NewArrayWithSizeAndStructure&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapObjectCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapObjectCount&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CallDOM&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;const&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Signature&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;node&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Effect&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;signature&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;top&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;World&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;read&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DOMState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reads&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rawRepresentation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;top&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
                &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Options&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;validateDFGClobberize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
                    &lt;span class=&quot;n&quot;&gt;clobberTopFunctor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Heap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
                &lt;span class=&quot;nf&quot;&gt;write&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractHeap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DOMState&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;writes&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rawRepresentation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()));&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ASSERT_WITH_MESSAGE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;effect&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DOMJIT&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;HeapRange&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;top&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;Currently, we do not accept any def for CallDOM.&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;(Remember that these &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;AbstractHeap&lt;/code&gt; operations are very similar to DOMJIT’s
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt; with a couple more details—and in some cases even contain DOMJIT
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;HeapRange&lt;/code&gt;s!)&lt;/p&gt;

&lt;p&gt;This &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;CallDOM&lt;/code&gt; node is the way for the DOM APIs in the browser—a significant
chunk of the builtins, which are written in C++—to communicate what they do
to the optimizing compiler. Without any annotations, the JIT has to assume that
a call into C++ could do anything to the JIT state. Bummer!&lt;/p&gt;

&lt;p&gt;But because, for example, &lt;a href=&quot;https://developer.mozilla.org/en-US/docs/Web/API/Node/firstChild&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Node.firstChild&lt;/code&gt;&lt;/a&gt; &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/32bda1b1d73527ba1d05ccba0aa8e463ddeac56d/Source/WebCore/domjit/JSNodeDOMJIT.cpp#L86&quot;&gt;annotates what
memory it reads from&lt;/a&gt; and what it &lt;em&gt;doesn’t&lt;/em&gt; write to,
the JIT can optimize around it better—or even remove the access completely.
It means the JIT can reason about calls to known builtins &lt;em&gt;the same way&lt;/em&gt; that
it reasons about normal JIT opcodes.&lt;/p&gt;

&lt;p&gt;(Incidentally it looks like it doesn’t even make a C call, but instead is
inlined as a little memory read snippet using a JIT builder API. Neat.)&lt;/p&gt;
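
&lt;p&gt;As a rough illustration of what those annotations buy the compiler, here is a toy redundancy check. It is a sketch under stated assumptions, not what the DFG does: the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Instr&lt;/code&gt; struct, the integer callee and argument ids, and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;isRedundant&lt;/code&gt; are all hypothetical, and it assumes straight-line code and builtins whose result depends only on their argument and the memory they declare reading.&lt;/p&gt;

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical half-open range [begin, end) of abstract heap indices.
// begin == end means "touches nothing".
struct HeapRange {
    uint16_t begin;
    uint16_t end;

    bool isEmpty() const { return begin == end; }

    bool overlaps(HeapRange other) const
    {
        if (isEmpty() || other.isEmpty())
            return false;
        return begin < other.end && other.begin < end;
    }
};

// A toy instruction: which builtin it calls, on which value, and what it
// declares about memory. None of these names come from JSC.
struct Instr {
    int callee;
    int argument;
    HeapRange reads;
    HeapRange writes;
};

// Returns true when instrs[later] must produce the same value as
// instrs[earlier], so a compiler could reuse the earlier result.
// Assumes earlier < later and that everything is in one basic block.
bool isRedundant(const std::vector<Instr>& instrs, std::size_t earlier, std::size_t later)
{
    const Instr& first = instrs[earlier];
    const Instr& second = instrs[later];
    if (first.callee != second.callee || first.argument != second.argument)
        return false;
    // A call that writes memory has to run every time.
    if (!first.writes.isEmpty())
        return false;
    // Any write in between that touches what the call reads kills the reuse.
    for (std::size_t i = earlier + 1; i < later; ++i) {
        if (instrs[i].writes.overlaps(first.reads))
            return false;
    }
    return true;
}
```

&lt;p&gt;With no annotations, every call would have to declare that it writes the whole heap, and the loop above would give up on the first intervening call.&lt;/p&gt;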

&lt;!-- TODO tie it back to the original example --&gt;

&lt;!--
B3 from JSC
https://github.com/WebKit/WebKit/blob/main/Source/JavaScriptCore/b3/B3Effects.h
https://github.com/WebKit/WebKit/blob/5811a5ad27100acab51f1d5ba4518eed86bbf00b/Source/JavaScriptCore/b3/B3AbstractHeapRepository.h

DOMJIT from JSC
https://github.com/WebKit/WebKit/blob/main/Source/WebCore/domjit/generate-abstract-heap.rb
generates from https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/WebCore/domjit/DOMJITAbstractHeapRepository.yaml#L4

DFG from JSC
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGAbstractHeap.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGClobberize.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGClobberize.cpp
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGClobberize.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGStructureAbstractValue.cpp
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGStructureAbstractValue.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGClobberSet.h
https://github.com/WebKit/WebKit/blob/b99cb96a7a3e5978b475d2365b72196e15a1a326/Source/JavaScriptCore/dfg/DFGStructureAbstractValue.h
--&gt;

&lt;p&gt;Last, we’ll look at Simple, which has a slightly different take on all of this.&lt;/p&gt;

&lt;h2 id=&quot;simple&quot;&gt;Simple&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/seaofnodes/simple&quot;&gt;Simple&lt;/a&gt; is Cliff Click’s pet Sea of
Nodes (SoN) project to try and showcase the idea to the world—outside of a
HotSpot C2 context.&lt;/p&gt;

&lt;p&gt;This one is a little harder for me to understand but it looks like each
translation unit has a &lt;a href=&quot;https://github.com/SeaOfNodes/Simple/blob/1426384fc7d0e9947e38ad6d523a5e53c324d710/chapter10/src/main/java/com/seaofnodes/simple/node/StartNode.java#L33&quot;&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;StartNode&lt;/code&gt;&lt;/a&gt; that doles out
different classes of memory nodes for each alias class. Each IR node then takes
data dependencies on whatever effect nodes it might use.&lt;/p&gt;

&lt;p&gt;Alias classes are split up based on the paper &lt;a href=&quot;/assets/img/tbaa.pdf&quot;&gt;Type-Based Alias Analysis&lt;/a&gt;
(PDF): “Our approach is a form of TBAA similar to the ‘FieldTypeDecl’ algorithm
described in the paper.”&lt;/p&gt;
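
&lt;p&gt;Here is my rough mental model of that, written as a small sketch rather than as a port of Simple (which is Java). The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Graph&lt;/code&gt; class and its methods are hypothetical, alias classes are assumed to be one per field name, and the initial memory node for each class stands in for what a start node would hand out.&lt;/p&gt;

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// A node whose only edges are plain pointers to its inputs.
struct Node {
    std::string label;
    std::vector<const Node*> inputs;
};

// Toy graph builder that tracks the current memory node per alias class.
class Graph {
public:
    // Returns the current memory node for the alias class of `field`,
    // creating the initial one on first use.
    const Node* memoryFor(const std::string& field)
    {
        auto it = m_memory.find(field);
        if (it != m_memory.end())
            return it->second;
        const Node* initial = add("InitialMem(" + field + ")", {});
        m_memory[field] = initial;
        return initial;
    }

    // A load depends on the current memory of its own alias class only.
    const Node* load(const std::string& field)
    {
        return add("Load(" + field + ")", { memoryFor(field) });
    }

    // A store consumes the current memory of its alias class and becomes
    // the new memory for that class. Other classes are untouched.
    const Node* store(const std::string& field)
    {
        const Node* node = add("Store(" + field + ")", { memoryFor(field) });
        m_memory[field] = node;
        return node;
    }

private:
    const Node* add(std::string label, std::vector<const Node*> inputs)
    {
        m_nodes.push_back(std::make_unique<Node>(Node { std::move(label), std::move(inputs) }));
        return m_nodes.back().get();
    }

    std::map<std::string, const Node*> m_memory;
    std::vector<std::unique_ptr<Node>> m_nodes;
};
```

&lt;p&gt;In this model, two loads of field &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;f&lt;/code&gt; with only a store to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;g&lt;/code&gt; between them end up with the same memory input, so nothing in the graph orders them against that store.&lt;/p&gt;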

&lt;!--

Cliff Click says:

All effects are represented as edges in the graph, the same edges as normal value flows, and all edges in Simple/C2 are simple pointers (and hence are unlabeled).

StartNode produces all effects and StopNode consumes them; same for Call and CallEnd.
Effects, being just another form of value, can be merged in PhiNodes.
Effects are generally split into smaller disjoint pieces, and recombined before Stop/Call.  Splitting into disjoint pieces allows more precision in the IR, and so more optimizations.
The common first split is the Memory effect from all other effects.  Other effects are generally some form of abstract i/o (all file system operations, reading/writing device controller memory, all external calls to disjoint address spaces, etc), or control.  Control is Just Another Edge denoting normal control flow, and e.g. data ops that depend on a prior control op use it to guard for safety.  Things like div-by-0, or null-ptr-check, or array-index-OOB are all done with a control edge to the guarding test.

Memory effects are further split into disjoint aliases; operations in one alias class can never overlap with another (this is a Y/N choice, not a may/must choice).  These aliases are equivalence classes; all mem ops belong in exactly one class, and the set of classes exactly partitions all of memory.  Common splits are fields in a struct (no &apos;f&apos; field ever overlaps with any &apos;g&apos; field), or kinds of arrays (no int[] overlaps with a flt[]).

In this example a = l[0]; l[0] = 5, we might have as IR:

a = Load(ctrl-for-AIOOB, mem-for-int[], offset);
mem-for-int[] = Store(ctrl-for-AIOOB, mem-for-int[], offset, 5)



Note that the Load and Store are not ordered here.  This Store IS ordered against all other int[] Stores.
The serializing algo Global Code Motion will add an anti-dep as needed, and then order the Load &amp; Store.

Splitting is basically by having a &quot;narrow&quot; user read from a &quot;fat memory&quot;.  Narrow, because its using a single alias and is one of the memops (e.g. Loads and Stores).  A &quot;fat memory&quot; always comes from Start &amp; CallEnd.  A MemMerge can merge a bunch of narrow aliases (and one fat) and make a fat memory.  Basically its all done lazily by &quot;doing nothing&quot;, and requiring the graph builder not produce a junk graph.

Splitting happens when the Parser decides you are manipulating a slice.
THere are some peephole&apos;s for widening the split region over a larger area, allowing more memory optimizations in the larger wider area.
Load &amp; Stores have a peep to move &quot;up past&quot; a MemMerge on the correct alias edge.
--&gt;

&lt;p&gt;The Simple project is structured into sequential implementation stages and
alias classes come into the picture in &lt;a href=&quot;https://github.com/SeaOfNodes/Simple/tree/main/chapter10&quot;&gt;Chapter 10&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Because I spent a while spelunking through other implementations to see how
other projects did this, here is a list of the projects I looked at. Mostly,
they use bitsets.&lt;/p&gt;

&lt;h2 id=&quot;other-implementations&quot;&gt;Other implementations&lt;/h2&gt;

&lt;h3 id=&quot;hhvm&quot;&gt;HHVM&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/facebook/hhvm&quot;&gt;HHVM&lt;/a&gt;, a JIT for the
&lt;a href=&quot;https://hacklang.org/&quot;&gt;Hack&lt;/a&gt; language, also uses a bitset for its memory
effects. See for example: &lt;a href=&quot;https://github.com/facebook/hhvm/blob/0395507623c2c08afc1d54c0c2e72bc8a3bd87f1/hphp/runtime/vm/jit/alias-class.h&quot;&gt;alias-class.h&lt;/a&gt; and
&lt;a href=&quot;https://github.com/facebook/hhvm/blob/0395507623c2c08afc1d54c0c2e72bc8a3bd87f1/hphp/runtime/vm/jit/memory-effects.h&quot;&gt;memory-effects.h&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;HHVM has a couple places that use this information, such as &lt;a href=&quot;https://github.com/facebook/hhvm/blob/4cdb85bf737450bf6cb837d3167718993f9170d7/hphp/runtime/vm/jit/def-sink.cpp&quot;&gt;a
definition-sinking pass&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/0395507623c2c08afc1d54c0c2e72bc8a3bd87f1/hphp/runtime/vm/jit/alias-analysis.h&quot;&gt;alias
analysis&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/4cdb85bf737450bf6cb837d3167718993f9170d7/hphp/runtime/vm/jit/dce.cpp&quot;&gt;DCE&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/4cdb85bf737450bf6cb837d3167718993f9170d7/hphp/runtime/vm/jit/store-elim.cpp&quot;&gt;store
elimination&lt;/a&gt;, &lt;a href=&quot;https://github.com/facebook/hhvm/blob/1f9eda80656b79634b6956084481ed5a43d8bc2e/hphp/runtime/vm/jit/refcount-opts.cpp&quot;&gt;refcount opts&lt;/a&gt;, and
more.&lt;/p&gt;

&lt;p&gt;If you are wondering why the HHVM representation looks similar to the Cinder
representation, it’s because some former HHVM engineers such as Brett Simmers
also worked on Cinder!&lt;/p&gt;

&lt;h3 id=&quot;android-art&quot;&gt;Android ART&lt;/h3&gt;

&lt;p&gt;(Note that I am linking an ART fork on GitHub as a reference, but the upstream
code is &lt;a href=&quot;https://android.googlesource.com/platform/art/+/refs/heads/main/compiler/optimizing/nodes.h&quot;&gt;hosted on googlesource&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Android’s &lt;a href=&quot;https://source.android.com/docs/core/runtime&quot;&gt;ART Java runtime&lt;/a&gt; also
uses a bitset for its effect representation. It’s a very compact class called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SideEffects&lt;/code&gt; in &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/nodes.h#L1602&quot;&gt;nodes.h&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The side effects are used in &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/licm.cc#L104&quot;&gt;loop-invariant code motion&lt;/a&gt;, &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/gvn.cc#L204&quot;&gt;global
value numbering&lt;/a&gt;, &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/write_barrier_elimination.cc#L45&quot;&gt;write barrier
elimination&lt;/a&gt;, &lt;a href=&quot;https://github.com/LineageOS/android_art/blob/c09a5c724799afdc5f89071b682b181c0bd23099/compiler/optimizing/scheduler.cc#L55&quot;&gt;scheduling&lt;/a&gt;,
and more.&lt;/p&gt;

&lt;h3 id=&quot;netcoreclr&quot;&gt;.NET/CoreCLR&lt;/h3&gt;

&lt;p&gt;CoreCLR mostly &lt;a href=&quot;https://github.com/dotnet/runtime/blob/a0878687d02b42034f4ea433ddd7a72b741510b8/src/coreclr/jit/sideeffects.h#L169&quot;&gt;uses a bitset&lt;/a&gt; for its &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SideEffectSet&lt;/code&gt;
class. This one is interesting though because it also splits out effects
specifically to include sets of local variables (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;LclVarSet&lt;/code&gt;).&lt;/p&gt;

&lt;h3 id=&quot;v8&quot;&gt;V8&lt;/h3&gt;

&lt;p&gt;V8 is also about six completely different compilers in a trenchcoat.&lt;/p&gt;

&lt;p&gt;Turboshaft uses a struct in &lt;a href=&quot;https://github.com/v8/v8/blob/e817fdf31a2947b2105bd665067d92282e4b4d59/src/compiler/turboshaft/operations.h#L577&quot;&gt;operations.h&lt;/a&gt; called
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;OpEffects&lt;/code&gt; which is two bitsets for reads/writes of effects. This is used in
&lt;a href=&quot;https://github.com/v8/v8/blob/42f5ff65d12f0ef9294fa7d3875feba938a81904/src/compiler/turboshaft/value-numbering-reducer.h#L164&quot;&gt;value numbering&lt;/a&gt; as well as a bunch of
other small optimization passes they call “reducers”.&lt;/p&gt;
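
&lt;p&gt;To make the “two bitsets” idea concrete, here is a minimal Python sketch. It
is not V8’s actual layout: the class name, the three heap locations, and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;may_clobber&lt;/code&gt; are all made up for illustration.&lt;/p&gt;

```python
import operator
from dataclasses import dataclass

# Hypothetical abstract heap locations, one bit each.
FIELDS = 1
ARRAY_ELEMENTS = 2
GLOBALS = 4
ANY = FIELDS | ARRAY_ELEMENTS | GLOBALS


@dataclass(frozen=True)
class OpEffectsSketch:
    reads: int = 0   # bitset of locations the operation may read
    writes: int = 0  # bitset of locations the operation may write

    def may_clobber(self, other):
        # A write in self may change a value that other reads when the
        # write set and the read set share at least one bit.
        # operator.and_ is the function form of bitwise AND.
        return operator.and_(self.writes, other.reads) != 0


load_field = OpEffectsSketch(reads=FIELDS)
store_elem = OpEffectsSketch(writes=ARRAY_ELEMENTS)
call = OpEffectsSketch(reads=ANY, writes=ANY)
```

&lt;p&gt;With something like this, a value numbering pass could keep reusing an earlier
field load across the array store, but would have to forget it at the call.&lt;/p&gt;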

&lt;p&gt;Maglev also has this thing called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;NodeT::kProperties&lt;/code&gt; in &lt;a href=&quot;https://github.com/v8/v8/blob/42f5ff65d12f0ef9294fa7d3875feba938a81904/src/maglev/maglev-ir.h&quot;&gt;their IR
nodes&lt;/a&gt; that also looks like a bitset and is used in their various
reducers. It has effect query methods on it such as &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;can_eager_deopt&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;can_write&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Until recently, V8 also used Sea of Nodes as its IR, which also
tracks side effects more explicitly in the structure of the IR itself.&lt;/p&gt;

&lt;h3 id=&quot;guile&quot;&gt;Guile&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;https://www.gnu.org/software/guile/&quot;&gt;Guile Scheme&lt;/a&gt; looks like it has a &lt;a href=&quot;https://wingolog.org/archives/2014/05/18/effects-analysis-in-guile&quot;&gt;custom tagging
scheme&lt;/a&gt; type thing.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Both bitsets and int ranges are perfectly cromulent ways of representing heap
effects for your IR. The Sea of Nodes approach is also probably okay since it
powers HotSpot C2 and (for a time) V8.&lt;/p&gt;

&lt;p&gt;Remember to ask &lt;em&gt;the right questions&lt;/em&gt; of your IR when doing analysis.&lt;/p&gt;

&lt;h2 id=&quot;thank-you&quot;&gt;Thank you&lt;/h2&gt;

&lt;p&gt;Thank you to &lt;a href=&quot;http://www.filpizlo.com/&quot;&gt;Fil Pizlo&lt;/a&gt; for writing his initial
GitHub Gist and sending me on this journey, and thank you to &lt;a href=&quot;https://www.chrisgregory.me/&quot;&gt;Chris
Gregory&lt;/a&gt;, Brett Simmers, and &lt;a href=&quot;https://ufuk.dev/&quot;&gt;Ufuk
Kayserilioglu&lt;/a&gt; for feedback on making some of the
explanations more helpful.&lt;/p&gt;

&lt;!--

TODO Dart
https://github.com/dart-lang/sdk/blob/59905c43f1a0394394ad5545ee439bcba63dea55/runtime/vm/constants_riscv.h#L968
https://github.com/dart-lang/sdk/blob/59905c43f1a0394394ad5545ee439bcba63dea55/runtime/vm/compiler/backend/redundancy_elimination.cc#L758
https://github.com/dart-lang/sdk/blob/59905c43f1a0394394ad5545ee439bcba63dea55/runtime/vm/compiler/backend/redundancy_elimination.cc#L1096

ChakraCore
https://github.com/chakra-core/ChakraCore/blob/2dba810c925eb366e44a1f7d7a5b2e289e2f8510/lib/Runtime/Types/RecyclableObject.h#L172

SpiderMonkey
https://github.com/servo/mozjs/blob/77645ed41f588297fd8d7edaee71500f4c83d070/mozjs-sys/mozjs/js/src/jit/MIR.h#L935
https://github.com/servo/mozjs/blob/77645ed41f588297fd8d7edaee71500f4c83d070/mozjs-sys/mozjs/js/src/jit/MIR.h#L9658

Cinder LIR
https://github.com/facebookincubator/cinderx/blob/main/cinderx/Jit/lir/instruction.h

HotSpot C1

HotSpot C2

PyPy
https://github.com/pypy/pypy/blob/main/rpython/jit/codewriter/effectinfo.py
https://github.com/pypy/pypy/blob/main/rpython/jit/metainterp/optimizeopt/heap.py#L59

LLVM
https://llvm.org/docs/LangRef.html#tbaa-metadata

LLVM MemorySSA
https://llvm.org/docs/MemorySSA.html

MLIR
https://mlir.llvm.org/docs/Rationale/SideEffectsAndSpeculation/

MEMOIR
https://conf.researchr.org/details/cgo-2024/cgo-2024-main-conference/31/Representing-Data-Collections-in-an-SSA-Form

Scala LMS graph IR
https://2023.splashcon.org/details/splash-2023-oopsla/46/Graph-IRs-for-Impure-Higher-Order-Languages-Making-Aggressive-Optimizations-Affordab

MIR and borrow checker
https://rustc-dev-guide.rust-lang.org/part-3-intro.html#source-code-representation

&gt; &quot;Fabrice Rastello, Florent Bouchez Tichadou (2022) SSA-based Compiler Design&quot;--most (all?) chapters in Part III, Extensions, are pretty much motivated by doing alias analysis in some way

Intermediate Representations in Imperative Compilers: A Survey
http://kameken.clique.jp/Lectures/Lectures2013/Compiler2013/a26-stanier.pdf

Partitioned Lattice per Variable (PLV) -- that&apos;s in Chapter 13 on SSI

TODO maybe lattice in ascent

--&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:dfg-use-type&quot;&gt;
      &lt;p&gt;This is because the DFG compiler does this interesting thing
where they track and guard the input types on &lt;em&gt;use&lt;/em&gt; vs having types
attached to the input’s own &lt;em&gt;def&lt;/em&gt;. It might be a clean way to handle shapes
inside the type system while also allowing the type+shape of an object to
change over time (which it can do in many dynamic language runtimes). &lt;a href=&quot;#fnref:dfg-use-type&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Tue, 11 Nov 2025 00:00:00 +0000</pubDate>
            <niceDate>November 11, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/compiler-effects/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/compiler-effects/</guid>
        </item>
        
        <item>
            <title>Walking around the compiler</title>
            <description>&lt;p&gt;Walking around outside is good for you.&lt;sup&gt;[&lt;a href=&quot;https://en.wikipedia.org/wiki/Wikipedia:Citation_needed&quot;&gt;&lt;i&gt;citation needed&lt;/i&gt;&lt;/a&gt;]&lt;/sup&gt;
A nice amble through the trees can quiet inner turbulence and make complex
engineering problems disappear.&lt;/p&gt;

&lt;p&gt;Vicki Boykis wrote a post, &lt;a href=&quot;https://vickiboykis.com/2025/09/09/walking-around-the-app/&quot;&gt;Walking around the
app&lt;/a&gt;, about a more
proverbial stroll. In it, she talks about constantly using your production
application’s interface to make sure the whole thing is cohesively designed
with few rough edges.&lt;/p&gt;

&lt;p&gt;She also talks about walking around other parts of the &lt;em&gt;implementation&lt;/em&gt; of the
application, fixing inconsistencies, complex machinery, and broken builds. Kind
of like picking up someone else’s trash on your hike.&lt;/p&gt;

&lt;p&gt;That’s awesome and universally good advice for pretty much every software
project. It got me thinking about how I walk around the compiler.&lt;/p&gt;

&lt;h2 id=&quot;what-does-your-output-look-like&quot;&gt;What does your output look like?&lt;/h2&gt;

&lt;p&gt;There’s a certain class of software project that transforms data—compression
libraries, compilers, search engines—for which there’s another layer of
“walking around” you can do. You have the code, yes, but you also have
&lt;em&gt;non-trivial output&lt;/em&gt;.&lt;/p&gt;

&lt;!-- TODO pick another term --&gt;

&lt;p&gt;By non-trivial, I mean an output that scales along some quality axis instead of
something semi-regular like a JSON response. For compression, it’s size. For
compilers, it’s generated code.&lt;/p&gt;

&lt;p&gt;You probably already have some generated cases checked into your codebase as
tests. That’s awesome. I think golden tests are fantastic for correctness and
for helping people understand. But this isolated understanding may not scale to
more complex examples.&lt;/p&gt;

&lt;p&gt;How &lt;em&gt;does&lt;/em&gt; your compiler handle, for example, switch-case statements in loops?
Does it do the jump threading you expect it to? Maybe you’re sitting there idly
wondering while you eat a cookie, but maybe that thought would only have
occurred to you while you were scrolling through the optimizer.&lt;/p&gt;

&lt;h3 id=&quot;an-example&quot;&gt;An example&lt;/h3&gt;

&lt;p&gt;Say you are &lt;a href=&quot;https://cfbolz.de/&quot;&gt;CF Bolz-Tereick&lt;/a&gt; and you are paging through
&lt;a href=&quot;https://pypy.org/&quot;&gt;PyPy&lt;/a&gt; IR. You notice some IR that looks like:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = ...
v1 = float_abs v0
...
v2 = float_abs v1
...
v3 = float_abs v2
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;“Huh”, you say to yourself, “surely the optimizer can reason that running
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;float_abs&lt;/code&gt; on the result of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;float_abs&lt;/code&gt; is redundant!”&lt;/p&gt;

&lt;p&gt;But some quirk in your optimizer means that it does not. Maybe it used to work,
or maybe it never did. But this little stroll revealed a bug with a quick fix
(adding a new peephole optimization function):&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;optimize_FLOAT_ABS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;get_box_replacement&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getarg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;arg_op&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;optimizer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as_operation&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arg_op&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;bp&quot;&gt;None&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;and&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;arg_op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;getopnum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;FLOAT_ABS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;make_equal_to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;emit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;op&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now, thankfully, your IR looks much better:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;v0 = ...
v1 = float_abs v0
...
...
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and you can check this in as a tidy test case:&lt;/p&gt;

&lt;div class=&quot;language-python highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;test_abs_abs_no&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;ops&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
    [f1]
    f2 = float_abs(f1)
    f3 = float_abs(f2)
    escape_f(f3)
    &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;expected&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;
    [f1]
    f2 = float_abs(f1)
    escape_f(f2)
    &lt;/span&gt;&lt;span class=&quot;sh&quot;&gt;&quot;&quot;&quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;optimize_loop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ops&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;expected&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
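
&lt;p&gt;If you want to poke at the idea without a PyPy checkout, here is a toy,
self-contained Python version of the same rewrite. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Op&lt;/code&gt; class and the
function name are hypothetical and much simpler than PyPy’s real optimizer.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    # Hypothetical toy IR node: an opcode name plus a single operand,
    # which is either another Op or the name of an input.
    name: str
    arg: object


def optimize_float_abs(op):
    # float_abs is idempotent, so float_abs(float_abs(x)) can be replaced
    # by the inner float_abs(x). Everything else passes through unchanged.
    if (
        op.name == "float_abs"
        and isinstance(op.arg, Op)
        and op.arg.name == "float_abs"
    ):
        return op.arg
    return op


inner = Op("float_abs", "f1")
outer = Op("float_abs", inner)
```

&lt;p&gt;Calling &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;optimize_float_abs(outer)&lt;/code&gt; hands back &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inner&lt;/code&gt;, which mirrors what
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;make_equal_to&lt;/code&gt; does in the real fix.&lt;/p&gt;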

&lt;p&gt;Fun fact: this was my first exposure to the PyPy project. CF walked me through
fixing this bug&lt;sup id=&quot;fnref:actual-fix&quot;&gt;&lt;a href=&quot;#fn:actual-fix&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt; live at ECOOP 2022! I had a great time.&lt;/p&gt;

&lt;h3 id=&quot;internal-state&quot;&gt;Internal state&lt;/h3&gt;

&lt;p&gt;If checking (and, later, testing) your assumptions is tricky, this may be a
sign that your library does not expose enough of its internal state to
developers. This may present a usability impediment that prevents you from
immediately checking your assumptions or suspicions.&lt;/p&gt;

&lt;p&gt;For an excellent source of inspiration, see &lt;a href=&quot;https://x.com/thingskatedid/status/1386077306381242371&quot;&gt;Kate’s tweets about program
internals&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Even if your library does provide a flag like &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;--zjit-dump-hir&lt;/code&gt; to print to the console,
maybe this is hard to run from a phone&lt;sup id=&quot;fnref:log-off&quot;&gt;&lt;a href=&quot;#fn:log-off&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;2&lt;/a&gt;&lt;/sup&gt; or a friend’s computer. For
that, you may want &lt;em&gt;friendlier tools&lt;/em&gt;.&lt;/p&gt;

&lt;h2 id=&quot;mechanical-sympathy-and-the-compiler-explorer&quot;&gt;Mechanical sympathy and the compiler explorer&lt;/h2&gt;

&lt;p&gt;The right kind of tool invites exploration.&lt;/p&gt;

&lt;p&gt;Matthew Godbolt built the first friendly compiler explorer tool I used, the
&lt;a href=&quot;https://godbolt.org/&quot;&gt;Compiler Explorer&lt;/a&gt; (“Godbolt”). It allows inputting
programs into your web browser in many different languages and immediately
seeing the compiled result. It will even execute your programs, within reason.&lt;/p&gt;

&lt;p&gt;This is a powerful tool:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;The feedback is near-instant and updates live on key-up.&lt;/li&gt;
  &lt;li&gt;There is no fussing with the command line and file watching.&lt;/li&gt;
  &lt;li&gt;Where possible, it highlights slices of source and compiled result to
indicate what regions produced what output.&lt;/li&gt;
  &lt;li&gt;It’s open source and you can add your own compiler.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This combination lowers the barrier to checking things &lt;em&gt;tremendously&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Now, sometimes you want the reverse: a Compiler Explorer-like thing in your
terminal or editor so you don’t have to break flow. I unfortunately have not
found a comparable tool.&lt;/p&gt;

&lt;p&gt;In addition to the immediate effects of being able to spot-check certain inputs
and outputs, continued use of these tools builds long-term intuition about the
behavior of the compiler. It builds &lt;em&gt;mechanical sympathy&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I haven’t written a lot about mechanical sympathy other than my grad school
&lt;a href=&quot;/assets/img/statement-of-purpose.pdf&quot;&gt;statement of purpose&lt;/a&gt; (PDF) and a few
brief internet posts, so I will leave you with that for now.&lt;/p&gt;

&lt;h2 id=&quot;every-function-is-special&quot;&gt;Every function is special&lt;/h2&gt;

&lt;p&gt;Your compiler likely compiles some applications and you can likely get access
to the IR for the functions in that application.&lt;/p&gt;

&lt;p&gt;Scroll through every function’s optimized IR. If there are too many, maybe the
top N functions’ IRs. See what can be improved. Maybe you will see some
unexpected patterns. Even if you don’t notice anything in May, that could shift
by August because of compiler advancements or a cool paper that you read in the
intervening months.&lt;/p&gt;

&lt;p&gt;One time I found a bizarre reference counting bug that was causing
copy-on-write and potential memory issues by noticing that some objects that
should have been marked “immortal” in the IR were actually being refcounted.
The bug was not in the compiler, but far away in application setup code—and
yet it was visible in the IR.&lt;/p&gt;

&lt;h2 id=&quot;love-your-tools&quot;&gt;Love your tools&lt;/h2&gt;

&lt;p&gt;My conclusion is similar to Vicki’s.&lt;/p&gt;

&lt;p&gt;Put some love into your tools. Your colleagues will notice. Your users will
notice. It might even improve your mood.&lt;/p&gt;

&lt;h2 id=&quot;acknowledgements&quot;&gt;Acknowledgements&lt;/h2&gt;

&lt;p&gt;Thank you to &lt;a href=&quot;https://cfbolz.de/&quot;&gt;CF&lt;/a&gt; for feedback on the post.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:actual-fix&quot;&gt;
      &lt;p&gt;The &lt;a href=&quot;https://github.com/pypy/pypy/commit/a31689c0b5977f8a73cca87c216dc8884aa34a76&quot;&gt;actual
fix&lt;/a&gt;
checks for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;float_abs(float_abs(x))&lt;/code&gt; and rewrites it to
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;float_abs(x)&lt;/code&gt;. &lt;a href=&quot;#fnref:actual-fix&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:log-off&quot;&gt;
      &lt;p&gt;Just make sure to log off and touch grass. &lt;a href=&quot;#fnref:log-off&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Tue, 23 Sep 2025 00:00:00 +0000</pubDate>
            <niceDate>September 23, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/walking-around/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/walking-around/</guid>
        </item>
        
        <item>
            <title>Linear scan with lifetime holes</title>
            <description>
&lt;p&gt;In my &lt;a href=&quot;/blog/linear-scan/&quot;&gt;last post&lt;/a&gt;, I explained a bit about how to retrofit
SSA onto the original linear scan algorithm. I went over all of the details for
how to go from low-level IR to register assignments—liveness analysis,
scheduling, building intervals, and the actual linear scan algorithm.&lt;/p&gt;

&lt;p&gt;Basically, we made it to 1997 linear scan, with small adaptations for
allocating directly on SSA.&lt;/p&gt;

&lt;p&gt;This time, we’re going to retrofit &lt;em&gt;lifetime holes&lt;/em&gt;.&lt;/p&gt;

&lt;h2 id=&quot;lifetime-holes&quot;&gt;Lifetime holes&lt;/h2&gt;

&lt;p&gt;Lifetime holes come into play because a linearized sequence of instructions is
not a great proxy for storing or using metadata about a program originally
stored as a graph.&lt;/p&gt;

&lt;p&gt;According to &lt;a href=&quot;/assets/img/wimmer-linear-scan-ssa.pdf&quot;&gt;Linear Scan Register Allocation on SSA Form&lt;/a&gt; (PDF,
2010):&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;The lifetime interval of a virtual register must cover all parts where this
register is needed, with lifetime holes in between. Lifetime holes occur
because the control flow graph is reduced to a list of blocks before register
allocation. If a register flows into an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;else&lt;/code&gt;-block, but not into the
corresponding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if&lt;/code&gt;-block, the lifetime interval has a hole for the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;if&lt;/code&gt;-block.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Lifetime holes come from &lt;a href=&quot;/assets/img/quality-speed-linear-scan-ra-clean.pdf&quot;&gt;Quality and Speed in Linear-scan Register
Allocation&lt;/a&gt; (PDF, 1998) by Traub, Holloway, and Smith. Figure 1,
though not in SSA form, is a nice diagram for understanding how lifetime holes
may occur. Unfortunately, the paper contains a rather sparse plaintext
description of their algorithm that I did not understand how to apply to my
concrete allocator.&lt;/p&gt;

&lt;p&gt;Thankfully, other papers continued this line of research in (at least) 2002,
2005, and 2010. We will piece snippets from those papers together to understand
what’s going on.&lt;/p&gt;

&lt;p&gt;Let’s take a look at the sample IR snippet from Wimmer2010 to illustrate how
lifetime holes form:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;16: label B1(R10, R11):
18: jump B2($1, R11)
     # vvvvvvvvvv #
20: label B2(R12, R13)
22: cmp R13, $1
24: branch lessThan B4() else B3()

26: label B3()
28: mul R12, R13 -&amp;gt; R14
30: sub R13, $1 -&amp;gt; R15
32: jump B2(R14, R15)

34: label B4()
     # ^^^^^^^^^^ #
36: add R10, R12 -&amp;gt; R16
38: ret R16
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Virtual register R12 is not used between positions 28 and 34. For this reason,
Wimmer’s interval building algorithm assigns it the interval &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[[20, 28), [34,
...)]&lt;/code&gt;. Note how the interval has two disjoint ranges with space in the middle.&lt;/p&gt;

&lt;p&gt;Our simplified interval building algorithm from last time gave us—in the same
notation—the interval &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[[20, ...)]&lt;/code&gt; (well, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[[20, 36)]&lt;/code&gt; in our modified
snippet). This simplified interval only supports one range with no lifetime
holes.&lt;/p&gt;

&lt;p&gt;Ideally we would be able to use the physical register assigned to R12 for
another virtual register in this empty slot! For example, maybe R14 or R15,
which have short lifetimes that completely fit into the hole.&lt;/p&gt;
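
&lt;p&gt;As a rough sketch of what “fits into the hole” means, here is some Python
that finds the gaps in a list of ranges and checks whether another interval
sits entirely inside one. The helper names are made up, and the ranges for R14
and R15 are my own guesses from reading the snippet, not something the paper
states.&lt;/p&gt;

```python
def holes(ranges):
    # ranges: sorted, disjoint, half-open (start, end) pairs.
    # A hole is the gap between the end of one range and the start of the next.
    return [(a_end, b_start) for (_, a_end), (b_start, _) in zip(ranges, ranges[1:])]


def fits_in_hole(ranges, hole):
    # True if every range of the interval lies entirely inside the hole.
    hole_start, hole_end = hole
    return all(start >= hole_start and hole_end >= end for start, end in ranges)


END = 40  # assumed stand-in for the "..." that ends R12's last range
r12 = [(20, 28), (34, END)]
r14 = [(28, 32)]  # assumed: defined at 28, last used by the jump at 32
r15 = [(30, 32)]  # assumed: defined at 30, last used by the jump at 32
```

&lt;p&gt;Here &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;holes(r12)&lt;/code&gt; gives the single gap &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(28, 34)&lt;/code&gt;, and both assumed short
intervals pass &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fits_in_hole&lt;/code&gt; for that gap.&lt;/p&gt;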

&lt;p&gt;Another example is a control-flow diamond. In this example, B1 jumps to either
B3 or B2, which then merge at B4. Virtual register R0 is defined in B1 and only
used in one of the branches, B3. It’s also not used in B4—if it were used in
B4, it would be live in both B2 and B3!&lt;/p&gt;

&lt;!--
# dot IN.dot -Tsvg -Nfontname=Monospace -Efontname=Monospace &gt; OUT.svg

digraph G {
node [shape=plaintext]
B1 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B1()&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;R0 = loadi $123&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;1&quot;&gt;blt →B3, →B2&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
B1:s -&gt; B3:params:n;
B1:s -&gt; B2:params:n;
B2 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B2()&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;R1 = loadi $456&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;1&quot;&gt;R2 = add R1, $1&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;2&quot;&gt;jump →B4&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
B2:s -&gt; B4:params:n;
B3 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B3()&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;R3 = mul R0, $2&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;1&quot;&gt;jump →B4&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
B3:s -&gt; B4:params:n;
B4 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B4()&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;ret $5&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
}
--&gt;
&lt;figure&gt;
&lt;object class=&quot;svg&quot; type=&quot;image/svg+xml&quot; data=&quot;/assets/img/lsra-diamond-cfg.svg&quot;&gt;&lt;/object&gt;
&lt;/figure&gt;

&lt;p&gt;Once we schedule it, the need for lifetime holes becomes more apparent:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;0: label B1:
2: R0 = loadi $123
4: blt iftrue: →B3, iffalse: →B2

6: label B2:
8: R1 = loadi $456
10: R2 = add R1, $1
12: jump →B4

14: label B3:
16: R3 = mul R0, $2
18: jump →B4

20: label B4:
22: ret $5
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Since B2 gets scheduled (in this case, arbitrarily) before B3, there’s a gap
where R0—which is completely unused in B2—would otherwise take up space in
our simplified interval form. Let’s fix that by adding some lifetime holes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Even though&lt;/strong&gt; we are adding some gaps between ranges, each interval still
gets assigned &lt;em&gt;one location for its entire life&lt;/em&gt;. It’s just that in the gaps,
we get to put other smaller intervals, like lichen growing between bricks.&lt;/p&gt;
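
&lt;p&gt;To see why the holes help, here is a small Python sketch of an overlap check
between intervals made of several ranges. The function names are hypothetical
and the ranges are ones I read off the scheduled listing by hand, so treat them
as assumptions rather than allocator output.&lt;/p&gt;

```python
def ranges_overlap(a, b):
    # Half-open ranges (start, end) overlap when each one ends after the
    # other one starts.
    return a[1] > b[0] and b[1] > a[0]


def intervals_overlap(xs, ys):
    # Two intervals conflict only if some pair of their ranges overlaps.
    return any(ranges_overlap(x, y) for x in xs for y in ys)


# Assumed ranges for the diamond example.
r0_single = [(2, 16)]          # one range, from definition to last use
r0_holes = [(2, 6), (14, 16)]  # live in B1, dead in B2, live in B3 until its use
r1 = [(8, 10)]                 # defined at 8, last used at 10
```

&lt;p&gt;With the single range, R0 and R1 conflict and need different registers. With
the hole, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;intervals_overlap(r0_holes, r1)&lt;/code&gt; is false, so R1 is free to borrow
R0’s register while R0 is dead in B2.&lt;/p&gt;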

&lt;p&gt;To get lifetime holes, we have to modify our interval data structure a bit.&lt;/p&gt;

&lt;h2 id=&quot;finding-lifetime-holes&quot;&gt;Finding lifetime holes&lt;/h2&gt;

&lt;p&gt;Our interval currently only supports a single range:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;nb&quot;&gt;attr_reader&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:range&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;set_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can change this to support multiple ranges by changing &lt;em&gt;just one character&lt;/em&gt;!!!&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;nb&quot;&gt;attr_reader&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:ranges&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;set_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Har har. Okay, so we now have an array of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Range&lt;/code&gt; instead of just a single
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Range&lt;/code&gt;. But now we have to implement the methods differently.&lt;/p&gt;

&lt;p&gt;We’ll start with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;initialize&lt;/code&gt;. The start state of an interval is an empty array
of ranges:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;initialize&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because we’re iterating backwards through the blocks and backwards through
instructions in each block, we’ll be starting with instruction 38 and working
our way linearly backwards until 16.&lt;/p&gt;

&lt;p&gt;This means that we’ll see later uses before earlier uses, and uses before defs.
In order to keep the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@ranges&lt;/code&gt; array in sorted order, we need to add each new
range to the front. This is O(n) in an array, so use a deque or linked list.
(Alternatively, push to the end and then reverse them afterwards.)&lt;/p&gt;

&lt;!-- TODO why keep them disjoint? --&gt;

&lt;p&gt;We keep the ranges in sorted order because it makes keeping them disjoint
easier, as we’ll see in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_range&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set_from&lt;/code&gt;. Let’s start with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set_from&lt;/code&gt;
since it’s very similar to the previous version:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;set_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;empty?&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# @ranges is empty when we don&apos;t have a use of the vreg&lt;/span&gt;
      &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
      &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert_sorted_and_disjoint&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_range&lt;/code&gt; has a couple more cases, but we’ll go through them step by step.
First, a quick check that the range is the right way ‘round:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ArgumentError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Invalid range: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; to &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then we have a straightforward case: if we don’t have any ranges yet, add a
brand new one:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;empty?&lt;/span&gt;
      &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;But if we do have ranges, this new range might be totally subsumed by the
existing first range. This happens if a virtual register is live for the
entirety of a block and also used inside that block. The uses that cause an
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_range&lt;/code&gt; don’t add any new information:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;cover?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;..&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;assert_sorted_and_disjoint&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Another case is that the new range has a partial overlap with the existing
first range. This happens when we’re adding ranges for all of the live-out
virtual registers; the range for the predecessor block (say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[4, 8]&lt;/code&gt;) will abut
the range for the successor block (say &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[8, 12]&lt;/code&gt;). We merge these ranges into
one big range (say, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[4, 12]&lt;/code&gt;):&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;cover?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;assert_sorted_and_disjoint&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;The last case is the one that gives us lifetime holes. It happens when the new
range is completely disjoint from the existing first range. This is also
straightforward: put the new range at the start of the list.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# TODO(max): Use a linked list or deque or something to avoid O(n) insertions&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@ranges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assert_sorted_and_disjoint&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
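
&lt;p&gt;Assembled in one place, the cases above might look like the following sketch. The body of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;assert_sorted_and_disjoint&lt;/code&gt; is not shown in this post, so the version here is an assumption, and the instruction numbers in the usage example are made up for illustration:&lt;/p&gt;

```ruby
class Interval
  attr_reader :ranges

  def initialize
    @ranges = []
  end

  def set_from(from)
    if @ranges.empty?
      # No use of the vreg was seen, so the interval is just the definition.
      @ranges.push(Range.new(from, from))
    else
      @ranges[0] = Range.new(from, @ranges[0].end)
    end
    assert_sorted_and_disjoint
  end

  def add_range(from, to)
    raise ArgumentError, "Invalid range: #{from} to #{to}" if from >= to
    if @ranges.empty?
      @ranges.push(Range.new(from, to))
    elsif @ranges.first.cover?(from..to)
      # Subsumed by the existing first range: no new information.
    elsif @ranges.first.cover?(to)
      # Overlaps or abuts the first range: extend it backwards.
      @ranges[0] = Range.new(from, @ranges.first.end)
    else
      # Disjoint from the first range: this is what creates a lifetime hole.
      @ranges.unshift(Range.new(from, to))
    end
    assert_sorted_and_disjoint
  end

  private

  # Assumed implementation: each range must start strictly after the
  # previous one ends.
  def assert_sorted_and_disjoint
    @ranges.each_cons(2) do |a, b|
      raise "ranges out of order or overlapping: #{a}, #{b}" unless b.begin > a.end
    end
  end
end

# Ranges arrive back to front, as they would from a backwards walk.
interval = Interval.new
interval.add_range(30, 38)
interval.add_range(20, 24)
interval.add_range(16, 20) # abuts 20..24, so the two merge
interval.set_from(17)
p interval.ranges # prints [17..24, 30..38]
```

&lt;p&gt;The gap between 24 and 30 is the lifetime hole: nothing needs the virtual register there, so its physical register could hold something else in the meantime.&lt;/p&gt;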

&lt;p&gt;This is all fine and good. I added this to the register allocator to test out
the lifetime hole finding but kept the rest the same (I changed the APIs
slightly so the interval could pretend it was still one big range). The tests
passed. Neat!&lt;/p&gt;

&lt;p&gt;I also verified that the lifetime holes were what we expected. This means our
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_intervals&lt;/code&gt; function works unmodified with the new &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Interval&lt;/code&gt;
implementation. That makes sense, given that we copied the implementation off
of Wimmer2010, which can deal with lifetime holes.&lt;/p&gt;

&lt;p&gt;Now we would like to use this new information in the register allocator.&lt;/p&gt;

&lt;h2 id=&quot;modified-linear-scan&quot;&gt;Modified linear scan&lt;/h2&gt;

&lt;p&gt;It took a little bit of untangling, but the required modifications to support
lifetime holes in the register assignment phase are not too invasive. To get an
idea of the difference, I took the original &lt;a href=&quot;/assets/img/linearscan-ra.pdf&quot;&gt;Poletto1999&lt;/a&gt; (PDF) algorithm
and rewrote it in the style of the &lt;a href=&quot;/assets/img/linear-scan-ra-context-ssa.pdf&quot;&gt;Mössenböck2002&lt;/a&gt; (PDF)
algorithm.&lt;/p&gt;

&lt;p&gt;For example, here is Poletto1999:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;LinearScanRegisterAllocation
active ← {}
foreach live interval i, in order of increasing start point
  ExpireOldIntervals(i)
  if length(active) = R then
    SpillAtInterval(i)
  else
    register[i] ← a register removed from pool of free registers
    add i to active, sorted by increasing end point

ExpireOldIntervals(i)
foreach interval j in active, in order of increasing end point
  if endpoint[j] ≥ startpoint[i] then
    return
  remove j from active
  add register[j] to pool of free registers

SpillAtInterval(i)
spill ← last interval in active
if endpoint[spill] &amp;gt; endpoint[i] then
  register[i] ← register[spill]
  location[spill] ← new stack location
  remove spill from active
  add i to active, sorted by increasing end point
else
  location[i] ← new stack location
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And here it is again, reformatted a bit. The implicit &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;unhandled&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;handled&lt;/code&gt;
sets that don’t get names in Poletto1999 now get names. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ExpireOldIntervals&lt;/code&gt; is
inlined and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;SpillAtInterval&lt;/code&gt; gets a new name:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;LINEARSCAN()
unhandled ← all intervals in increasing order of their start points
active ← {}; handled ← {}
free ← set of available registers
while unhandled ≠ {} do
  cur ← pick and remove the first interval from unhandled
  //----- check for active intervals that expired
  for each interval i in active do
    if i ends before cur.beg then
      move i to handled and add i.reg to free

  //----- collect available registers in f
  f ← free

  //----- select a register from f
  if f = {} then
    ASSIGNMEMLOC(cur) // see below
  else
    cur.reg ← any register in f
    free ← free – {cur.reg}
    move cur to active

ASSIGNMEMLOC(cur: Interval)
spill ← last interval in active
if spill.end &amp;gt; cur.end then
  cur.reg ← spill.reg
  spill.location ← new stack location
  move spill from active to handled
  move cur to active
else
  cur.location ← new stack location
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now we can pick out all of the bits of Mössenböck2002 that look like they are
responsible for dealing with lifetime holes.&lt;/p&gt;

&lt;p&gt;For example, the algorithm now has a fourth set, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inactive&lt;/code&gt;. This set holds
intervals that have holes that contain the current interval’s start position.
These intervals are assigned registers that are potential candidates for the
current interval to live in (more on this in a sec).&lt;/p&gt;

&lt;p&gt;I say potential candidates because in order for them to be a home for the
current interval, an inactive interval has to be completely disjoint from the
current interval. If they overlap at all—in any of their ranges—then we
would be trying to put two virtual registers into one physical register at the
same program point. That’s a bad compile.&lt;/p&gt;
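
&lt;p&gt;As a rough sketch of that disjointness check, each interval below is modelled as a plain array of inclusive Ruby ranges. The helper names are hypothetical and are not part of the allocator in this post. Treating a shared endpoint as an overlap is also an assumption, chosen to match the ≥ comparison in the Poletto1999 pseudocode above:&lt;/p&gt;

```ruby
# Two inclusive ranges share a point when the later start is not past the
# earlier end.
def ranges_meet?(a, b)
  [a.end, b.end].min >= [a.begin, b.begin].max
end

# Hypothetical helper: true if any range of one interval meets any range of
# the other. This is the simple quadratic version; since both lists are
# sorted, a two-pointer walk could do the same job in linear time.
def intervals_overlap?(xs, ys)
  xs.any? { |x| ys.any? { |y| ranges_meet?(x, y) } }
end

inactive = [4..8, 20..24] # lifetime hole from 9 to 19
fits     = [10..18]       # lives entirely inside the hole
clashes  = [10..22]       # runs into the second range

p intervals_overlap?(inactive, fits)    # prints false
p intervals_overlap?(inactive, clashes) # prints true
```

&lt;p&gt;Only the first candidate could share a physical register with the inactive interval.&lt;/p&gt;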

&lt;p&gt;We have to do a little extra bookkeeping in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ASSIGNMEMLOC&lt;/code&gt; because now one
physical register can be assigned to more than one interval that is still in
the middle of being processed (active and inactive sets). If we choose to
spill, we have to make sure that all conflicting uses of the register
(intervals that overlap with the current interval) get reassigned locations.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;LINEARSCAN()
unhandled ← all intervals in increasing order of their start points
active ← {}; handled ← {}
inactive ← {}
free ← set of available registers
while unhandled ≠ {} do
  cur ← pick and remove the first interval from unhandled
  //----- check for active intervals that expired
  for each interval i in active do
    if i ends before cur.beg then
      move i to handled and add i.reg to free
    else if i does not overlap cur.beg then
      move i to inactive and add i.reg to free
  //----- check for inactive intervals that expired or become reactivated
  for each interval i in inactive do
    if i ends before cur.beg then
      move i to handled
    else if i overlaps cur.beg then
      move i to active and remove i.reg from free

  //----- collect available registers in f
  f ← free
  for each interval i in inactive that overlaps cur do f ← f – {i.reg}

  //----- select a register from f
  if f = {} then
    ASSIGNMEMLOC(cur) // see below
  else
    cur.reg ← any register in f
    free ← free – {cur.reg}
    move cur to active

ASSIGNMEMLOC(cur: Interval)
spill ← heuristic: pick some interval from active or inactive
if spill.end &amp;gt; cur.end then
  r = spill.reg
  conflicting = set of active or inactive intervals with register r that
    overlap with cur
  move all intervals in conflicting to handled
  assign memory locations to them
  cur.reg ← r
  move cur to active
else
  cur.location ← new stack location
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
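
&lt;p&gt;To make the set bookkeeping concrete, here is a minimal Ruby sketch of the main loop. It is not the allocator from this series: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Iv&lt;/code&gt; is a hypothetical stand-in for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Interval&lt;/code&gt;, ranges are assumed to be inclusive, and the spill path is deliberately simplified to always spill the current interval instead of running the full &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ASSIGNMEMLOC&lt;/code&gt; heuristic:&lt;/p&gt;

```ruby
require "set"

# Iv is a hypothetical stand-in for the post's Interval class: a name plus
# sorted, disjoint, inclusive ranges, and the assigned register or stack slot.
Iv = Struct.new(:name, :ranges, :reg, :slot) do
  def beg
    ranges.first.begin
  end

  def fin
    ranges.last.end
  end

  # True if some range (not a hole) contains the position.
  def covers?(pos)
    ranges.any? { |r| r.cover?(pos) }
  end

  # True if any range of self shares a point with any range of other.
  def overlaps?(other)
    ranges.any? do |a|
      other.ranges.any? { |b| [a.end, b.end].min >= [a.begin, b.begin].max }
    end
  end
end

def linear_scan(intervals, registers)
  unhandled = intervals.sort_by { |i| i.beg }
  active = []
  inactive = []
  handled = []
  free = Set.new(registers)
  next_slot = 0
  until unhandled.empty?
    cur = unhandled.shift
    # Active intervals either expire, fall into a hole, or stay active.
    active.dup.each do |i|
      if cur.beg > i.fin
        active.delete(i)
        handled.push(i)
        free.add(i.reg)
      elsif !i.covers?(cur.beg)
        active.delete(i)
        inactive.push(i)
        free.add(i.reg)
      end
    end
    # Inactive intervals either expire, wake up again, or stay inactive.
    inactive.dup.each do |i|
      if cur.beg > i.fin
        inactive.delete(i)
        handled.push(i)
      elsif i.covers?(cur.beg)
        inactive.delete(i)
        active.push(i)
        free.delete(i.reg)
      end
    end
    # A register held by an inactive interval is usable only if that
    # interval never collides with cur.
    f = free.dup
    inactive.each { |i| f.delete(i.reg) if i.overlaps?(cur) }
    if f.empty?
      # Simplification (assumption): always spill cur.
      cur.slot = next_slot
      next_slot += 1
      handled.push(cur)
    else
      cur.reg = f.first
      free.delete(cur.reg)
      active.push(cur)
    end
  end
  intervals
end

a = Iv.new("a", [0..2, 10..12])
b = Iv.new("b", [4..6])
c = Iv.new("c", [1..11])
d = Iv.new("d", [11..13])
linear_scan([a, b, c, d], [:r0, :r1])
[a, b, c, d].each { |i| puts "#{i.name}: reg=#{i.reg.inspect} slot=#{i.slot.inspect}" }
```

&lt;p&gt;With two registers, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; fits into the hole in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and reuses its register, while &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;d&lt;/code&gt; starts after &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; has been reactivated and ends up on the stack.&lt;/p&gt;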

&lt;p&gt;Note that this begins to depart from a strictly linear-time linear scan: the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inactive&lt;/code&gt; set is bounded not by the number of physical registers but instead
by the number of virtual registers. Mössenböck2002 notes that the size of the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;inactive&lt;/code&gt; set is generally very small, though, so “linear in practice”.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;EDIT: After re-reading Wimmer2010, I noticed that they say:&lt;/p&gt;

  &lt;blockquote&gt;
    &lt;p&gt;[…] introduced non-linear parts. Two of them are highlighted in Figure 6
where the set of inactive intervals is iterated. The set can contain an
arbitrary number of intervals since it is not bound by the number of
physical registers. Testing the current interval for intersection with all
of them can therefore be expensive.&lt;/p&gt;

    &lt;p&gt;When the lifetime intervals are created from code in SSA form, this test is
not necessary anymore: All intervals in inactive start before the current
interval, so they do not intersect with the current interval at their
definition. They are inactive and thus have a lifetime hole at the current
position, so they do not intersect with the current interval at its
definition. SSA form therefore guarantees that they never intersect [7],
making the entire loop that tests for intersection unnecessary.&lt;/p&gt;

    &lt;p&gt;Unfortunately, splitting of intervals leads to intervals that no longer
adhere to the SSA form properties because it destroys SSA form. Therefore,
the intersection test cannot be omitted completely; it must be performed if
the current interval has been split off from another interval.&lt;/p&gt;
  &lt;/blockquote&gt;

  &lt;p&gt;Which indicates to me that we may actually be able to leave off that loop
over the inactive intervals after all? Unclear. I’ll have to come back and
mess with this later.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I left out the parts about register weights that are heuristics to improve
register allocation. They are not core to supporting lifetime holes. You can
add them back in if you like.&lt;/p&gt;

&lt;p&gt;Here is a text diff to make it clear what changed:&lt;/p&gt;

&lt;div class=&quot;language-diff highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gh&quot;&gt;diff --git a/tmp/lsra b/tmp/lsra-holes
index e9de35b..de79a63 100644
&lt;/span&gt;&lt;span class=&quot;gd&quot;&gt;--- a/tmp/lsra
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+++ b/tmp/lsra-holes
&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;@@ -1,6 +1,7 @@&lt;/span&gt;
 LINEARSCAN()
 unhandled ← all intervals in increasing order of their start points
 active ← {}; handled ← {}
&lt;span class=&quot;gi&quot;&gt;+inactive ← {}
&lt;/span&gt; free ← set of available registers
 while unhandled ≠ {} do
   cur ← pick and remove the first interval from unhandled
&lt;span class=&quot;p&quot;&gt;@@ -8,9 +9,18 @@&lt;/span&gt; while unhandled ≠ {} do
   for each interval i in active do
     if i ends before cur.beg then
       move i to handled and add i.reg to free
&lt;span class=&quot;gi&quot;&gt;+    else if i does not overlap cur.beg then
+      move i to inactive and add i.reg to free
+  //----- check for inactive intervals that expired or become reactivated
+  for each interval i in inactive do
+    if i ends before cur.beg then
+      move i to handled
+    else if i overlaps cur.beg then
+      move i to active and remove i.reg from free
&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;
&lt;/span&gt;   //----- collect available registers in f
   f ← free
&lt;span class=&quot;gi&quot;&gt;+  for each interval i in inactive that overlaps cur do f ← f – {i.reg}
&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;
&lt;/span&gt;   //----- select a register from f
   if f = {} then
&lt;span class=&quot;p&quot;&gt;@@ -23,10 +33,10 @@&lt;/span&gt; while unhandled ≠ {} do
 ASSIGNMEMLOC(cur: Interval)
&lt;span class=&quot;gd&quot;&gt;-spill ← last interval in active
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+spill ← heuristic: pick some interval from active or inactive
&lt;/span&gt; if spill.end &amp;gt; cur.end then
&lt;span class=&quot;gd&quot;&gt;-  cur.reg ← spill.reg
-  spill.location ← new stack location
-  move spill from active to handled
&lt;/span&gt;&lt;span class=&quot;gi&quot;&gt;+  r = spill.reg
+  conflicting = set of active or inactive intervals with register r that
+    overlap with cur
+  move all intervals in conflicting to handled
+  assign memory locations to them
+  cur.reg ← r
&lt;/span&gt;   move cur to active
 else
   cur.location ← new stack location
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This reformatting and diffing made it much easier for me to reason about what
specifically had to be changed.&lt;/p&gt;

&lt;p&gt;There’s just one thing left after register assignment: resolution and SSA
destruction.&lt;/p&gt;

&lt;h2 id=&quot;resolution-and-ssa-destruction&quot;&gt;Resolution and SSA destruction&lt;/h2&gt;

&lt;p&gt;I’m pretty sure we can actually just keep the resolution the same. In our
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;resolve&lt;/code&gt; function, we are only making sure that the block arguments get
parallel-moved into the block parameters. That hasn’t changed.&lt;/p&gt;

&lt;p&gt;Wimmer2010 says:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Linear scan register allocation with splitting of lifetime intervals requires
a resolution phase after the actual allocation. Because the control flow
graph is reduced to a list of blocks, control flow is possible between blocks
that are not adjacent in the list. When the location of an interval is
different at the end of the predecessor and at the start of the successor, a
move instruction must be inserted to resolve the conflict.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s great news for us: we don’t do splitting. An interval, though it has
lifetime holes, still only ever has one location for its entire life. So once
an interval begins, we don’t need to think about moving its contents.&lt;/p&gt;

&lt;p&gt;So I was actually overly conservative in the previous post, which I have
amended!&lt;/p&gt;

&lt;h2 id=&quot;fixed-intervals-and-register-constraints&quot;&gt;Fixed intervals and register constraints?&lt;/h2&gt;

&lt;p&gt;Mössenböck2002 also tackles register constraints with this notion of “fixed
intervals”—intervals that have been pre-allocated physical registers.&lt;/p&gt;

&lt;p&gt;Since I eventually want to use “register hinting” from Wimmer2005 and
Wimmer2010, I’m going to ignore the fixed interval part of Mössenböck2002 for
now. It seems like fixed intervals and register hinting work nicely together.&lt;/p&gt;

&lt;h2 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;We added lifetime holes to our register allocator without too much effort. This
better maps the graph-like nature of the IR onto the linear sequence of
instructions and should get us some better allocation for short-lived virtual
registers.&lt;/p&gt;

&lt;p&gt;Maybe next time we will add &lt;em&gt;interval splitting&lt;/em&gt;, which will help us a) address
ABI constraints more cleanly in function calls and b) remove the dependence on
reserving a scratch register.&lt;/p&gt;
</description>
            <pubDate>Sun, 24 Aug 2025 00:00:00 +0000</pubDate>
            <niceDate>August 24, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/linear-scan-lifetime-holes/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/linear-scan-lifetime-holes/</guid>
        </item>
        
        <item>
            <title>Liveness analysis with Datalog</title>
            <description>&lt;p&gt;After publishing &lt;a href=&quot;/blog/linear-scan&quot;&gt;Linear scan register allocation on SSA&lt;/a&gt;, I
had a nice call with &lt;a href=&quot;https://waleedkhan.name&quot;&gt;Waleed Khan&lt;/a&gt; where he showed me
how to Datalog. He thought it might be useful to try implementing liveness
analysis as a Datalog problem.&lt;/p&gt;

&lt;p&gt;We started off with the Wimmer2010 CFG example from that post, sketching out
manually which variables were live out of each block: R10 out of B1, R12 out of
B2, etc.&lt;/p&gt;

&lt;figure&gt;
&lt;object class=&quot;svg&quot; type=&quot;image/svg+xml&quot; data=&quot;/assets/img/wimmer-lsra-cfg.svg&quot;&gt;&lt;/object&gt;
&lt;figcaption&gt;
    &lt;p&gt;The graph from Wimmer2010 has come back! Remember, we’re using block arguments
instead of phis, so &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B1(R10, R11)&lt;/code&gt; defines R10 and R11 before the first
instruction in B1.&lt;/p&gt;
  &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Then we tried to formulate liveness as a Datalog relation.&lt;/p&gt;

&lt;p&gt;Liveness is normally (at least for me) defined in terms of two relations:
live-in and live-out. Live-out is “what is needed” from all of the successors
of a block and live-in is the “what is needed” summary for a block. So, in
fake math notation:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;live-out(b) = union(live-in(s) for each successor s of b)
live-in(b) = (live-out(b) + used(b)) - defined(b)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;where each of the component parts of that expression represents a set of
variables:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;em&gt;used(b)&lt;/em&gt; is the set of variables referenced as in-operands to instructions in
a block&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;defined(b)&lt;/em&gt; is the set of variables defined by instructions in a block&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We ended up computing the live-in sets for blocks in the register allocator
post but then using the live-out sets instead. So today let’s compute both
live-in and live-out sets with Datalog!&lt;/p&gt;

&lt;h2 id=&quot;datalog&quot;&gt;Datalog&lt;/h2&gt;

&lt;p&gt;Datalog is a logic programming language. It probably looks and feels different
from every other programming language you have used… except for maybe SQL. It
might feel similar to SQL, except SQL has a certain order to it that Datalog
does not.&lt;/p&gt;

&lt;p&gt;We’ll be using Souffle here because Waleed mentioned it and also I learned a
bit about it in my databases class.&lt;/p&gt;

&lt;p&gt;The first thing you do is declare your relations, which is what Datalog calls
tables.&lt;/p&gt;

&lt;p&gt;In this case, if we want to compute liveness information, we have to know
information about what a block uses, defines, and what successors it has.&lt;/p&gt;

&lt;p&gt;The thing you have to know about Datalog is that it’s kind of like the
opposite of array programming. We’re going to express things about sets by
expressing facts about individual items in a set.&lt;/p&gt;

&lt;p&gt;For example, we’re not going to say “this block B4 uses [R10, R12, R16]”. We’re
going to say three separate facts: “B4 uses R10”, “B4 uses R12”, “B4 uses R16”.
You can think about it like each relation being a database table where each
parameter is a column name.&lt;/p&gt;
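
&lt;p&gt;If it helps to see the reshaping, here is a minimal Ruby sketch of going from
one set per block to one row per fact. The names &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;uses_by_block&lt;/code&gt; and
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;block_use_facts&lt;/code&gt; are made up for illustration; they are not part of Souffle
or of the previous post.&lt;/p&gt;

```ruby
require 'set'

# Hypothetical set-per-block representation: one entry per block.
uses_by_block = {
  'B4' => Set['R10', 'R12', 'R16']
}

# Flatten into one [block, var] row per fact, like rows in a database table.
block_use_facts = uses_by_block.flat_map do |block, vars|
  vars.map { |var| [block, var] }
end

block_use_facts.each { |block, var| puts "#{block} uses #{var}" }
```

&lt;p&gt;Each printed line corresponds to one row in a “uses” table with a block column
and a variable column.&lt;/p&gt;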

&lt;p&gt;Here are the relations for block uses, block defs, and which blocks follow
other blocks:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// liveness.dl
.decl block_use(block:symbol, var:symbol)
.decl block_def(block:symbol, var:symbol)
.decl block_succ(succ:symbol, pred:symbol)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Where &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;symbol&lt;/code&gt; here means string.&lt;/p&gt;

&lt;p&gt;We can then embed some facts inline. For example, this says “A defines R0 and
R1 and uses R0”:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;block_def(&quot;A&quot;, &quot;R0&quot;).
block_def(&quot;A&quot;, &quot;R1&quot;).
block_use(&quot;A&quot;, &quot;R0&quot;).
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;You can also provide facts as a TSV file, but the format is irritating to
construct manually and has given me silently wrong answers in Souffle before,
so I am not doing that for this example.&lt;/p&gt;

&lt;p&gt;You can, for your edification, manually encode all the use/def/successor facts
from the previous post into Souffle—or you can copy this chunk into your file:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// liveness.dl
// ...
block_def(&quot;B1&quot;, &quot;R10&quot;).
block_def(&quot;B1&quot;, &quot;R11&quot;).
block_use(&quot;B1&quot;, &quot;R11&quot;).

block_def(&quot;B2&quot;, &quot;R12&quot;).
block_def(&quot;B2&quot;, &quot;R13&quot;).
block_use(&quot;B2&quot;, &quot;R13&quot;).

block_def(&quot;B3&quot;, &quot;R14&quot;).
block_def(&quot;B3&quot;, &quot;R15&quot;).
block_use(&quot;B3&quot;, &quot;R12&quot;).
block_use(&quot;B3&quot;, &quot;R13&quot;).
block_use(&quot;B3&quot;, &quot;R14&quot;).
block_use(&quot;B3&quot;, &quot;R15&quot;).

block_def(&quot;B4&quot;, &quot;R16&quot;).
block_use(&quot;B4&quot;, &quot;R16&quot;).
block_use(&quot;B4&quot;, &quot;R10&quot;).
block_use(&quot;B4&quot;, &quot;R12&quot;).

// block_succ(succ, pred): for example, B2 is a successor of B1
block_succ(&quot;B2&quot;, &quot;B1&quot;).
block_succ(&quot;B3&quot;, &quot;B2&quot;).
block_succ(&quot;B2&quot;, &quot;B3&quot;).
block_succ(&quot;B4&quot;, &quot;B2&quot;).
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We can declare our live-in and live-out relations similarly to our use/def/succ
relations. We mark them as being &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.output&lt;/code&gt; so that Souffle presents us with the
results.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// liveness.dl
// ...
.decl live_out(block:symbol, var:symbol)
.output live_out
.decl live_in(block:symbol, var:symbol)
.output live_in
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Now it’s time to write the rules that define our relations. You may notice
that the Souffle definitions look very similar to our earlier definitions. This
is no mistake; Datalog is a natural fit for dataflow and graph problems.&lt;/p&gt;

&lt;p&gt;We’ll start with live-out:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// liveness.dl
// ...
live_out(b, v) :- block_succ(s, b), live_in(s, v).
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We read this left to right as “a variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v&lt;/code&gt; is live-out of block &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; if block
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s&lt;/code&gt; is a successor of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v&lt;/code&gt; is live-in to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s&lt;/code&gt;”. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:-&lt;/code&gt; defines the left
side in terms of the right side. The comma between &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;block_succ&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;live_in&lt;/code&gt;
means it’s a conjunction—&lt;em&gt;and&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Where’s the union? Well, remember what I said about array programming? We’re
not thinking in terms of sets. We’re thinking one program variable at a time.
As Souffle executes our relations, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;live_out&lt;/code&gt; will incrementally build up a
table.&lt;/p&gt;

&lt;p&gt;It’s also a little weird to program in this style because &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s&lt;/code&gt; wasn’t textually
defined anywhere like a parameter or a variable. You kind of have to think of
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;s&lt;/code&gt; as a connector, a binder, a foreign key, or what have you. It’s a placeholder.
(I don’t know how to explain this well. Sorry.)&lt;/p&gt;

&lt;p&gt;Then we can define live-in. This on the surface looks more complicated but I
think that is only because of Souffle’s choice of syntax.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;// liveness.dl
// ...
live_in(b, v) :- (live_out(b, v) ; block_use(b, v)), !block_def(b, v).
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It reads as “a variable &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;v&lt;/code&gt; is live-in to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt; if it is either live-out of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt;
or used in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt;, and &lt;em&gt;not&lt;/em&gt; defined in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;b&lt;/code&gt;”. The semicolons are
disjunctions—&lt;em&gt;or&lt;/em&gt;—and the exclamation points negations—&lt;em&gt;not&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;These relations look endlessly mutually recursive but you have to keep in mind
that we’re not running functions on data, exactly. We’re declaratively
expressing definitions of rules—relations. &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;block_use(b, v)&lt;/code&gt; in the body of
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;live_in&lt;/code&gt; is not calling a function but instead making a query—is the row
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;(b, v)&lt;/code&gt; in the table &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;block_use&lt;/code&gt;? Datalog builds the tables until saturation.&lt;/p&gt;
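
&lt;p&gt;To get a feel for what saturation means, here is a rough Ruby sketch of naive
bottom-up evaluation of those two rules over the facts above. Treat it as a toy
under stated assumptions, not as a description of how Souffle is implemented
(real engines work much harder to avoid redoing work). The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;saturate&lt;/code&gt; function
and the constant names are made up for illustration.&lt;/p&gt;

```ruby
require 'set'

# Facts copied from the Wimmer2010 example, one [block, var] row each.
BLOCK_DEF = Set[
  %w[B1 R10], %w[B1 R11],
  %w[B2 R12], %w[B2 R13],
  %w[B3 R14], %w[B3 R15],
  %w[B4 R16]
]
BLOCK_USE = Set[
  %w[B1 R11],
  %w[B2 R13],
  %w[B3 R12], %w[B3 R13], %w[B3 R14], %w[B3 R15],
  %w[B4 R16], %w[B4 R10], %w[B4 R12]
]
# Rows are [succ, pred], matching the block_succ declaration.
BLOCK_SUCC = Set[%w[B2 B1], %w[B3 B2], %w[B2 B3], %w[B4 B2]]

# Naive bottom-up evaluation: apply both rules to the current tables and
# add whatever rows they derive, until a full pass adds nothing new.
def saturate
  live_in = Set.new
  live_out = Set.new
  loop do
    size_before = live_in.size + live_out.size
    # live_out(b, v) :- block_succ(s, b), live_in(s, v).
    BLOCK_SUCC.each do |s, b|
      live_in.each { |blk, v| live_out.add([b, v]) if blk == s }
    end
    # live_in(b, v) :- (live_out(b, v) ; block_use(b, v)), !block_def(b, v).
    (live_out | BLOCK_USE).each do |row|
      live_in.add(row) unless BLOCK_DEF.include?(row)
    end
    break if live_in.size + live_out.size == size_before
  end
  [live_in, live_out]
end

live_in, live_out = saturate
puts "live_in:  #{live_in.sort.inspect}"
puts "live_out: #{live_out.sort.inspect}"
```

&lt;p&gt;The loop stops once a full pass over both rules derives no new rows. That has
to happen eventually because the tables only grow and there are finitely many
block/variable pairs.&lt;/p&gt;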

&lt;p&gt;Now we can run Souffle! We tell it to dump to standard output with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-D-&lt;/code&gt; but
you could just as easily have it dump each output relation in its own separate
file in the current directory by specifying &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;-D.&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-console highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;souffle &lt;span class=&quot;nt&quot;&gt;-D-&lt;/span&gt; liveness.dl
&lt;span class=&quot;go&quot;&gt;---------------
live_in
block   var
===============
B2      R10
B3      R10
B3      R12
B3      R13
B4      R10
B4      R12
===============
---------------
live_out
block   var
===============
B1      R10
B2      R10
B2      R12
B2      R13
B3      R10
===============
&lt;/span&gt;&lt;span class=&quot;gp&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That’s neat. We got nicely formatted tables and it only took us two lines of
code! Let’s compare to our Ruby code from the previous post to underscore the
point:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;analyze_liveness&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;order&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;post_order&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compute_initial_liveness_sets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;true&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;false&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;successors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;succ&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;true&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This is because we have separated the iteration-to-fixpoint bit from the main
bit of the dataflow analysis: the equation. If we let Datalog do the data
movement for us, we can work on defining the rules—and only the rules.&lt;/p&gt;
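
&lt;p&gt;As a sketch of that separation in plain Ruby: the hypothetical &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;fixpoint&lt;/code&gt;
helper below owns the loop, and the block passed to it is the only
problem-specific part. The helper name and the tiny edge list are made up for
illustration; the rule is graph reachability, a classic Datalog example.&lt;/p&gt;

```ruby
require 'set'

# Hypothetical helper: keep adding the rows the rule derives until the
# set of rows stops growing, then return it.
def fixpoint(initial)
  rows = initial
  loop do
    grown = rows | yield(rows)
    return rows if grown == rows
    rows = grown
  end
end

edges = Set[[1, 2], [2, 3], [3, 4]]

# Seeding with the edges plays the role of: path(x, y) :- edge(x, y).
paths = fixpoint(edges) do |known|
  # path(x, y) :- path(x, z), edge(z, y).
  known.flat_map do |x, z|
    edges.select { |from, _to| from == z }.map { |_from, y| [x, y] }
  end
end

puts paths.sort.inspect
```

&lt;p&gt;Swapping in a different analysis means swapping the block; the loop stays the
same.&lt;/p&gt;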

&lt;blockquote&gt;
  &lt;p&gt;This is probably why, in the fullness of time, many static analysis and
compiler tools end up growing some kind of embedded (partial) Datalog engine.
Call it Scholz’s tenth rule.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Souffle also has the ability to compile to C++, which gives you two nice
things:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;you can probably get faster execution&lt;/li&gt;
  &lt;li&gt;you can use it from an existing C++ program&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I don’t have any experience with this API.&lt;/p&gt;

&lt;p&gt;This is when Waleed mentioned offhandedly that he had heard about some embedded
Rust Datalog called &lt;a href=&quot;https://s-arash.github.io/ascent/&quot;&gt;Ascent&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;rust&quot;&gt;Rust&lt;/h2&gt;

&lt;p&gt;The front page of the Ascent website is a really great sell if you show up
thinking “gee, I wish I had Datalog to use in my Rust program”. Right out the
gate, you get reasonable-enough Datalog syntax via a proc macro.&lt;/p&gt;

&lt;p&gt;For example, here is the canonical path example for Souffle:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;.decl edge(x:number, y:number)
.decl path(x:number, y:number)

path(x, y) :- edge(x, y).
path(x, y) :- path(x, z), edge(z, y).
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;and in Ascent:&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;ascent!&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;relation&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;i32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;i32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
   &lt;span class=&quot;n&quot;&gt;relation&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;i32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;i32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

   &lt;span class=&quot;nf&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;--&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
   &lt;span class=&quot;nf&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;--&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Super.&lt;/p&gt;

&lt;p&gt;We weren’t sure if the Souffle liveness would port cleanly to Rust, but it sure
did! It even lets you use your own datatypes instead of just &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;i32&lt;/code&gt; (which the
front-page example uses).&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;use&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;ascent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ascent&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;;&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;#[derive(Clone,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;PartialEq,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;Eq,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;Hash,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;Copy)]&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;i32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;impl&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Debug&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Formatter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;&apos;_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Result&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;write!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;B{}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;#[derive(Clone,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;PartialEq,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;Eq,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;Hash,&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;Copy)]&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;struct&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;i32&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;impl&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Debug&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VarId&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Formatter&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt;&lt;span class=&quot;nv&quot;&gt;&apos;_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;std&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;fmt&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;Result&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;nd&quot;&gt;write!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;R{}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;self&lt;/span&gt;&lt;span class=&quot;na&quot;&gt;.0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;ascent!&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;relation&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;block_use&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;relation&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;block_def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;relation&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;block_succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;// (succ, pred)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;relation&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;live_out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;relation&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;live_out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;--&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;block_succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;nf&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;--&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;live_out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;block_use&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;block_def&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Notice how we don’t have an &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;input&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;output&lt;/code&gt; annotation like we did in
Souffle. That’s because this is designed to be embedded in an existing program,
which probably doesn’t want to deal with the disk (or at least wants to
read/write in its own format).&lt;/p&gt;

&lt;p&gt;Ascent lets us give it some vectors of data and then at the end lets us read
some vectors of data too.&lt;/p&gt;

&lt;div class=&quot;language-rust highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;// ...&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;fn&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;main&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;mut&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prog&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nn&quot;&gt;AscentProgram&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b4&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;BlockId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r10&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r11&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r12&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r13&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r14&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;14&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r15&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;let&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r16&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;VarId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;prog&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.block_def&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;vec!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r14&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r15&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;prog&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.block_succ&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;vec!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;prog&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.block_use&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;vec!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r11&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r13&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r14&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b3&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r15&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r10&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r12&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b4&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r16&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;];&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;prog&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;.run&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;live out: {:?}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prog&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.live_out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;nd&quot;&gt;println!&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;live in: {:?}&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;prog&lt;/span&gt;&lt;span class=&quot;py&quot;&gt;.live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Then we need only run &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cargo add ascent&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;cargo run&lt;/code&gt;—both of which worked
with zero issues—and see the results.&lt;/p&gt;

&lt;div class=&quot;language-console highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;gp&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;cargo run
&lt;span class=&quot;go&quot;&gt;    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.02s
     Running `target/debug/liveness`
live out: [(B2, R12), (B2, R13), (B2, R10), (B1, R10), (B3, R10)]
live in: [(B3, R12), (B3, R13), (B4, R10), (B4, R12), (B2, R10), (B3, R10)]
&lt;/span&gt;&lt;span class=&quot;gp&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;w&quot;&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;It’s not a fancy looking table, but it’s very close to my program, which is
neat.&lt;/p&gt;

&lt;p&gt;This is similar to embedding Souffle in C++ and then calling the C++. One
difference, though, is the Souffle process has two steps. It’s a slight build
system complication. But this isn’t meant to be a Datalog comparison post!&lt;/p&gt;

&lt;h2 id=&quot;more&quot;&gt;More?&lt;/h2&gt;

&lt;p&gt;Can we model all of linear scan this way? Maybe. I’m new to all this stuff.&lt;/p&gt;

&lt;p&gt;Ascent also seems to support lattices, which means we can use it to do abstract
interpretation on some cool domains.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://pointersgonewild.com/&quot;&gt;Maxime Chevalier-Boisvert&lt;/a&gt; and I prototyped
&lt;a href=&quot;https://github.com/shopify/loupe&quot;&gt;loupe&lt;/a&gt;, an interprocedural type analysis in
Rust. We had to build our own iterate-to-fixpoint engine, which was
non-trivial. I wonder how it would look to build something similar on top of
Ascent.&lt;/p&gt;

&lt;p&gt;I kind of want to check out &lt;a href=&quot;https://github.com/frankmcsherry/&quot;&gt;Frank
McSherry&lt;/a&gt;’s
&lt;a href=&quot;https://github.com/frankmcsherry/blog/blob/master/posts/2025-06-03.md&quot;&gt;datatoad&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;That’s all for now, folks. Just a couple Datalog snippets. Happy hacking.&lt;/p&gt;
</description>
            <pubDate>Wed, 13 Aug 2025 00:00:00 +0000</pubDate>
            <niceDate>August 13, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/liveness-datalog/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/liveness-datalog/</guid>
        </item>
        
        <item>
            <title>Linear scan register allocation on SSA</title>
            <description>&lt;p&gt;&lt;em&gt;Much of the code and education that resulted in this post happened with &lt;a href=&quot;https://tenderlovemaking.com/&quot;&gt;Aaron
Patterson&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The fundamental problem in register allocation is to take an IR that uses
virtual registers (as many as you like) and rewrite it to use a finite number
of physical registers and stack space&lt;sup id=&quot;fnref:calendaring&quot;&gt;&lt;a href=&quot;#fn:calendaring&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;This is an example of a code snippet using virtual registers:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;add R1, R2 -&amp;gt; R3
add R1, R3 -&amp;gt; R4
ret R4
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;And here is the same example after it has been passed through a register
allocator (note that Rs changed to Ps):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;add Stack[0], P0 -&amp;gt; P1
add Stack[0], P1 -&amp;gt; P0
ret
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Each virtual register was assigned a physical place: R1 to the stack, R2 to P0,
R3 to P1, and R4 &lt;em&gt;also&lt;/em&gt; to P0 (since we weren’t using R2 anymore).&lt;/p&gt;

&lt;p&gt;People use register allocators like they use garbage collectors: it’s an
abstraction that can manage your resources for you, maybe with some cost. When
writing the back-end of a compiler, it’s probably much easier to have a
separate register-allocator-in-a-box than manually managing variable lifetimes
while also considering all of your different target architectures.&lt;/p&gt;

&lt;p&gt;How do JIT compilers do register allocation? Well, “everyone knows” that “every
JIT does its own variant of linear scan”&lt;sup id=&quot;fnref:everyone&quot;&gt;&lt;a href=&quot;#fn:everyone&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;. This bothered me for some
time because I’ve worked on a couple of JITs and still didn’t understand the
backend bits.&lt;/p&gt;

&lt;p&gt;There are a couple different approaches to register allocation, but in this
post we’ll focus on &lt;em&gt;linear scan of SSA&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I started reading &lt;a href=&quot;/assets/img/wimmer-linear-scan-ssa.pdf&quot;&gt;Linear Scan Register Allocation on SSA Form&lt;/a&gt; (PDF,
2010) by Wimmer and Franz after writing &lt;a href=&quot;/blog/ssa/&quot;&gt;A catalog of ways to generate
SSA&lt;/a&gt;. Reading alone didn’t make a ton of sense—I ended up with a
lot of very frustrated margin notes. I started trying to implement it alongside
the paper. As it turns out, though, there is a rich history of papers in this
area that it leans on really heavily. I needed to follow the chain of
references!&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;For example, here is a lovely explanation of the process, start to finish,
from Christian Wimmer’s &lt;a href=&quot;/assets/img/wimmer-masters-thesis.pdf&quot;&gt;Master’s thesis&lt;/a&gt; (PDF, 2004).&lt;/p&gt;

  &lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;LINEAR_SCAN
  // order blocks and operations (including loop detection)
  COMPUTE_BLOCK_ORDER
  NUMBER_OPERATIONS
  // create intervals with live ranges
  COMPUTE_LOCAL_LIVE_SETS
  COMPUTE_GLOBAL_LIVE_SETS
  BUILD_INTERVALS
  // allocate registers
  WALK_INTERVALS
  RESOLVE_DATA_FLOW
  // replace virtual registers with physical registers
  ASSIGN_REG_NUM
  // special handling for the Intel FPU stack
  ALLOCATE_FPU_STACK
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;  &lt;/div&gt;

  &lt;p&gt;There it is, all laid out at once. It’s very refreshing when compared to all
of the compact research papers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I didn’t realize that there were more than one or two papers on linear scan. So
this post will also incidentally serve as a bit of a survey or a history of
linear scan—as best as I can figure it out, anyway. If you were in or near
the room where it happened, please feel free to reach out and correct some
parts.&lt;/p&gt;

&lt;h2 id=&quot;some-example-code&quot;&gt;Some example code&lt;/h2&gt;

&lt;p&gt;Throughout this post, we’ll use an example SSA code snippet from Wimmer2010,
adapted from phi-SSA to block-argument-SSA. Wimmer2010’s code snippet is
between the arrows and we add some filler (as alluded to in the paper):&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;label B1(R10, R11)
jump B2($1, R11)
 # vvvvvvvvvv #
label B2(R12, R13)
cmp R13, $1
branch lessThan B4()

label B3()
mul R12, R13 -&amp;gt; R14
sub R13, $1 -&amp;gt; R15
jump B2(R14, R15)

label B4()
 # ^^^^^^^^^^ #
add R10, R12 -&amp;gt; R16
ret R16
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Virtual registers start with R and are defined either with an arrow or by a
block parameter.&lt;/p&gt;

&lt;p&gt;Because it takes a moment to untangle the unfamiliar syntax and draw the
control-flow graph by hand, I’ve also provided the same code in graphical form.
Block names (and block parameters) are shaded with grey.&lt;/p&gt;

&lt;!--
# dot IN.dot -Tsvg -Nfontname=Monospace -Efontname=Monospace &gt; OUT.svg

digraph G {
node [shape=plaintext]
B1 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B1(R10, R11)&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;jump →B2($1, R11)&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
B1:s -&gt; B2:params:n;
B2 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B2(R12, R13)&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;cmp R13, $1&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;1&quot;&gt;blt →B4, →B3&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
B2:s -&gt; B4:params:n;
B2:s -&gt; B3:params:n;
B3 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B3()&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;R14 = mul R12, R13&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;1&quot;&gt;R15 = sub R13, $1&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;2&quot;&gt;jump →B2(R14, R15)&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
B3:s -&gt; B2:params:n;
B4 [label=&lt;&lt;TABLE BORDER=&quot;0&quot; CELLBORDER=&quot;1&quot; CELLSPACING=&quot;0&quot;&gt;
&lt;TR&gt;&lt;TD PORT=&quot;params&quot; BGCOLOR=&quot;lightgray&quot;&gt;B4()&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;0&quot;&gt;R16 = add R10, R12&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD ALIGN=&quot;left&quot; PORT=&quot;1&quot;&gt;ret R16&amp;nbsp;&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;&gt;];
}
--&gt;
&lt;figure&gt;
&lt;object class=&quot;svg&quot; type=&quot;image/svg+xml&quot; data=&quot;/assets/img/wimmer-lsra-cfg.svg&quot;&gt;&lt;/object&gt;
&lt;figcaption&gt;
    &lt;p&gt;We have one entry block, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B1&lt;/code&gt;, that is implied in
Wimmer2010. Its only job is to define &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;R10&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;R11&lt;/code&gt; for the rest of the CFG.&lt;/p&gt;

    &lt;p&gt;Then we have a loop between &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B2&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B3&lt;/code&gt; with an implicit fallthrough. Rather
than relying on the fallthrough, we generate a conditional branch with explicit jump
targets. This makes it possible to re-order blocks as much as we like.&lt;/p&gt;

    &lt;p&gt;The contents of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B4&lt;/code&gt; are also just to fill in the blanks from Wimmer2010 and
add some variable uses.&lt;/p&gt;
  &lt;/figcaption&gt;
&lt;/figure&gt;

&lt;p&gt;Our goal for the post is to analyze this CFG, assign physical locations
(registers or stack slots) to each virtual register, and then rewrite the code
appropriately.&lt;/p&gt;

&lt;p&gt;For now, let’s rewind the clock and look at how linear scan came about.&lt;/p&gt;

&lt;h2 id=&quot;in-the-beginning&quot;&gt;In the beginning&lt;/h2&gt;

&lt;p&gt;Linear scan register allocation (LSRA) has been around for a while. It’s neat
because it does the actual register assignment part of register allocation in
one pass over your low-level IR. (We’ll talk more about what that means in a
minute.)&lt;/p&gt;

&lt;p&gt;It first appeared in the literature in &lt;a href=&quot;/assets/img/tcc-linearscan-ra.pdf&quot;&gt;tcc: A System for Fast, Flexible, and
High-level Dynamic Code Generation&lt;/a&gt; (PDF, 1997) by Poletto, Engler,
and Kaashoek. (Until writing this post, I had never seen this paper. It was
only on a re-read of the 1999 paper (below) that I noticed it.) In this paper,
they mostly describe a staged variant of C called ‘C (TickC), for which a fast
register allocator is quite useful.&lt;/p&gt;

&lt;p&gt;Then came a paper called &lt;a href=&quot;/assets/img/quality-speed-linear-scan-ra-clean.pdf&quot;&gt;Quality and Speed in Linear-scan Register
Allocation&lt;/a&gt; (PDF, 1998) by Traub, Holloway, and Smith. It adds
some optimizations (lifetime holes, binpacking) to the algorithm presented in
Poletto1997.&lt;/p&gt;

&lt;p&gt;Then came the first paper I read, and I think the paper everyone refers to when
they talk about linear scan: &lt;a href=&quot;/assets/img/linearscan-ra.pdf&quot;&gt;Linear Scan Register Allocation&lt;/a&gt; (PDF,
1999) by Poletto and Sarkar. In this paper, they give a fast alternative to
graph coloring register allocation, especially motivated by just-in-time
compilers. In retrospect, it seems to be a bit of a rehash of the previous two
papers.&lt;/p&gt;

&lt;p&gt;Linear scan (1997, 1999) operates on &lt;em&gt;live ranges&lt;/em&gt; instead of virtual
registers. A live range is a pair of integers [start, end) (end is exclusive)
that begins when the register is defined and ends when it is last used. This
assumes that the instructions in your
program have already been fixed into a single linear sequence! It also means
that you have given each instruction a number that represents its position in
that order.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;This may or may not be a surprising requirement depending on your compilers
background. It was surprising to me because I often live in control flow
graph fantasy land where blocks are unordered and instructions sometimes
float around. But if you live in a land of basic blocks that are already in
reverse post order, then it may be less surprising.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In non-SSA-land, these live ranges are different from the virtual registers:
they represent the lifetime of each &lt;em&gt;version&lt;/em&gt; of a virtual register.
For an example, consider the following code snippet:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;...      -&amp;gt; a
add 1, a -&amp;gt; b
add 1, b -&amp;gt; c
add 1, c -&amp;gt; a
add 1, a -&amp;gt; d
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;There are two definitions of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; and they each live for different amounts of
time:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;                  a  b  c  a  d
...      -&amp;gt; a     |                &amp;lt;- the first a
add 1, a -&amp;gt; b     v  |
add 1, b -&amp;gt; c        v  |
add 1, c -&amp;gt; a           v  |       &amp;lt;- the second a
add 1, a -&amp;gt; d              v  |
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;In fact, the ranges are completely disjoint. It wouldn’t make sense for the
register allocator to consider variables, because there’s no reason the two
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt;s should necessarily live in the same physical register.&lt;/p&gt;
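
&lt;p&gt;To make that concrete, here is a minimal Ruby sketch (mine, not from any of the papers) that numbers the instructions in the snippet above and computes one range per &lt;em&gt;definition&lt;/em&gt; rather than per variable name. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insn&lt;/code&gt; struct and the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;live_ranges&lt;/code&gt; helper are hypothetical names made up for illustration, and it only handles straight-line code. It reports the index of the last use rather than an exclusive end.&lt;/p&gt;

```ruby
# Hypothetical instruction shape for illustration: the names an instruction
# uses and the one name it defines.
Insn = Struct.new(:uses, :defn)

CODE = [
  Insn.new([], :a),    # 0: ...      defines a
  Insn.new([:a], :b),  # 1: add 1, a defines b
  Insn.new([:b], :c),  # 2: add 1, b defines c
  Insn.new([:c], :a),  # 3: add 1, c defines a (again)
  Insn.new([:a], :d),  # 4: add 1, a defines d
]

# Returns one [name, start, last_use] triple per definition, in definition
# order. A use at index idx extends the most recent definition of that name.
def live_ranges(code)
  ranges = []
  current = {}  # name to the index in `ranges` of its latest definition
  code.each_with_index do |insn, idx|
    insn.uses.each do |name|
      ranges[current.fetch(name)][2] = idx
    end
    current[insn.defn] = ranges.length
    ranges.push([insn.defn, idx, idx])
  end
  ranges
end

live_ranges(CODE).each do |name, start, last_use|
  puts "#{name}: defined at #{start}, last used at #{last_use}"
end
```

&lt;p&gt;The two entries for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a&lt;/code&gt; come out as separate triples, which is the point: the allocator sees two unrelated ranges, not one variable.&lt;/p&gt;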

&lt;p&gt;In SSA land, it’s a little different: since each virtual register has only one
definition (by, uh, definition), live ranges are an exact 1:1 mapping with
virtual registers. &lt;strong&gt;We’ll focus on SSA for the remainder of the post because
this is what I am currently interested in.&lt;/strong&gt; The research community seems to
have decided that allocating directly on SSA gives more information to the
register allocator&lt;sup id=&quot;fnref:allocate-on-ssa&quot;&gt;&lt;a href=&quot;#fn:allocate-on-ssa&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;p&gt;Linear scan starts at the point in your compiler pipeline where you already know
these live ranges: you have already done some kind of analysis to build a
mapping from virtual registers to live ranges.&lt;/p&gt;

&lt;p&gt;In this blog post, we’re going to back up to the point where we’ve just built
our SSA low-level IR and have yet to do any work on it. We’ll do all of the
analysis from scratch.&lt;/p&gt;

&lt;p&gt;Part of this analysis is called &lt;em&gt;liveness analysis&lt;/em&gt;.&lt;/p&gt;

&lt;h2 id=&quot;liveness-analysis&quot;&gt;Liveness analysis&lt;/h2&gt;

&lt;p&gt;The result of liveness analysis is a mapping of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;BasicBlock -&amp;gt;
Set[Instruction]&lt;/code&gt; that tells you which virtual registers (remember, since we’re
in SSA, instruction==vreg) are alive (used later) at the beginning of the basic
block. This is called a &lt;em&gt;live-in&lt;/em&gt; set. For example:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;B0:
... -&amp;gt; R12
... -&amp;gt; R13
jmp B1

B1:
mul R12, R13 -&amp;gt; R14
sub R13, 1 -&amp;gt; R15
jmp B2

B2:
add R14, R15 -&amp;gt; R16
ret R16
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We compute liveness by working backwards: a variable is &lt;em&gt;live&lt;/em&gt; from the
first use we encounter while walking backwards (its last use in program order)
until we reach its definition.&lt;/p&gt;

&lt;p&gt;In this case, at the end of B2, nothing is live. If we step backwards to the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;ret&lt;/code&gt;, we see a use: R16 becomes live. If we step once more, we see its
definition, so R16 is no longer live, but now we see a use of R14 and R15, which
become live. This leaves us with R14 and R15 being &lt;em&gt;live-in&lt;/em&gt; to B2.&lt;/p&gt;
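
&lt;p&gt;That backwards walk over a single block can be sketched in a few lines of Ruby. This is only an illustration under my own assumptions: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Insn&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;live_in&lt;/code&gt; are hypothetical names, and it uses Ruby’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Set&lt;/code&gt; for readability.&lt;/p&gt;

```ruby
require "set"

# Hypothetical instruction shape: the names an instruction uses and the name
# it defines (nil if it defines nothing, like ret).
Insn = Struct.new(:uses, :defn)

B2 = [
  Insn.new([:R14, :R15], :R16),  # add R14, R15 defines R16
  Insn.new([:R16], nil),         # ret R16
]

# Walk the block bottom-up, starting from what is live at the end of it.
def live_in(block, live_out)
  live = live_out.dup
  block.reverse_each do |insn|
    live.delete(insn.defn) if insn.defn  # a definition ends liveness
    live.merge(insn.uses)                # a use starts liveness
  end
  live
end

p live_in(B2, Set.new).to_a.sort
```

&lt;p&gt;Starting from an empty live-out set, this leaves R14 and R15 in the result, matching the walk-through above.&lt;/p&gt;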

&lt;p&gt;This live-in set becomes B1’s &lt;em&gt;live-out&lt;/em&gt; set because B1 is B2’s predecessor. We
continue in B1. We could continue backwards linearly through the blocks. In
fact, I encourage you to do it as an exercise. You should have a (potentially
empty) set of registers per basic block.&lt;/p&gt;

&lt;p&gt;It gets more interesting, though, when we have branches: what does it mean when
two blocks’ live-in results merge into their shared predecessor? If we have two
blocks A and B that are successors of a block C, the live-in sets get
&lt;em&gt;unioned&lt;/em&gt; together.&lt;/p&gt;

&lt;!--
digraph G {
  node [shape=square];
  C -&gt; A;
  C -&gt; B;
}
--&gt;
&lt;figure&gt;
&lt;svg xmlns=&quot;http://www.w3.org/2000/svg&quot; xmlns:xlink=&quot;http://www.w3.org/1999/xlink&quot; width=&quot;98pt&quot; height=&quot;116pt&quot; viewBox=&quot;0.00 0.00 98.00 116.00&quot;&gt;
&lt;g id=&quot;graph0&quot; class=&quot;graph&quot; transform=&quot;scale(1 1) rotate(0) translate(4 112)&quot;&gt;
&lt;title&gt;G&lt;/title&gt;
&lt;polygon fill=&quot;white&quot; stroke=&quot;none&quot; points=&quot;-4,4 -4,-112 94,-112 94,4 -4,4&quot; /&gt;
&lt;!-- C --&gt;
&lt;g id=&quot;node1&quot; class=&quot;node&quot;&gt;
&lt;title&gt;C&lt;/title&gt;
&lt;polygon fill=&quot;none&quot; stroke=&quot;black&quot; points=&quot;63,-108 27,-108 27,-72 63,-72 63,-108&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;45&quot; y=&quot;-85.8&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;C&lt;/text&gt;
&lt;/g&gt;
&lt;!-- A --&gt;
&lt;g id=&quot;node2&quot; class=&quot;node&quot;&gt;
&lt;title&gt;A&lt;/title&gt;
&lt;polygon fill=&quot;none&quot; stroke=&quot;black&quot; points=&quot;36,-36 0,-36 0,0 36,0 36,-36&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;18&quot; y=&quot;-13.8&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;A&lt;/text&gt;
&lt;/g&gt;
&lt;!-- C&amp;#45;&amp;gt;A --&gt;
&lt;g id=&quot;edge1&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;C-&amp;gt;A&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M38.33,-71.7C35.42,-64.15 31.93,-55.12 28.68,-46.68&quot; /&gt;
&lt;polygon fill=&quot;black&quot; stroke=&quot;black&quot; points=&quot;32.01,-45.59 25.14,-37.52 25.48,-48.11 32.01,-45.59&quot; /&gt;
&lt;/g&gt;
&lt;!-- B --&gt;
&lt;g id=&quot;node3&quot; class=&quot;node&quot;&gt;
&lt;title&gt;B&lt;/title&gt;
&lt;polygon fill=&quot;none&quot; stroke=&quot;black&quot; points=&quot;90,-36 54,-36 54,0 90,0 90,-36&quot; /&gt;
&lt;text text-anchor=&quot;middle&quot; x=&quot;72&quot; y=&quot;-13.8&quot; font-family=&quot;Times,serif&quot; font-size=&quot;14.00&quot;&gt;B&lt;/text&gt;
&lt;/g&gt;
&lt;!-- C&amp;#45;&amp;gt;B --&gt;
&lt;g id=&quot;edge2&quot; class=&quot;edge&quot;&gt;
&lt;title&gt;C-&amp;gt;B&lt;/title&gt;
&lt;path fill=&quot;none&quot; stroke=&quot;black&quot; d=&quot;M51.67,-71.7C54.58,-64.15 58.07,-55.12 61.32,-46.68&quot; /&gt;
&lt;polygon fill=&quot;black&quot; stroke=&quot;black&quot; points=&quot;64.52,-48.11 64.86,-37.52 57.99,-45.59 64.52,-48.11&quot; /&gt;
&lt;/g&gt;
&lt;/g&gt;
&lt;/svg&gt;
&lt;/figure&gt;

&lt;p&gt;That is, if there were some register R0 live-in to B and some register R1
live-in to A, both R0 and R1 would be live-out of C. They may also be live-in
to C, but that entirely depends on the contents of C.&lt;/p&gt;
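
&lt;p&gt;To make the union concrete, here is a minimal sketch that uses Ruby’s Set
rather than the bitsets we use in the real analysis. The block names and the
two live-in sets are the hypothetical ones from the paragraph above; the
hashes are made up for illustration.&lt;/p&gt;

```ruby
require "set"

# Hypothetical live-in sets for the two successors of C.
live_in = {
  "A" => Set["R1"],
  "B" => Set["R0"],
}
successors = { "C" => ["A", "B"] }

# live-out(C) is the union of the live-in sets of all of C's successors.
live_out_c = successors["C"].map { |succ| live_in[succ] }.reduce(Set.new, :|)

puts live_out_c.to_a.sort.inspect
```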

&lt;p&gt;Since the number of virtual registers is finite for a given program, sets of
virtual registers (ordered by inclusion, merged with union) make a good lattice
for an &lt;em&gt;abstract interpreter&lt;/em&gt;. That’s right, we’re doing AI.&lt;/p&gt;

&lt;p&gt;In this liveness analysis, we’ll:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;compute a summary of what virtual registers each basic block needs to be
alive (gen set) and what variables it defines (kill set)&lt;/li&gt;
  &lt;li&gt;initialize all live-in sets to 0&lt;/li&gt;
  &lt;li&gt;do an iterative dataflow analysis over the blocks until the live-in sets
converge&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We store gen, kill, and live-in sets as bitsets, using some APIs conveniently
available on Ruby’s Integer class.&lt;/p&gt;
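
&lt;p&gt;As a rough sketch of the bitset idea: bit N of an integer stands for virtual
register N. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;each_bit&lt;/code&gt; helper below is only a guess at how such a
helper could be written, not necessarily the one used in this post, and it
spells the single-bit mask as a power of two where the real code uses a left
shift.&lt;/p&gt;

```ruby
# Assumed helper: yield the index of every set bit, lowest index first.
def each_bit(bits)
  bits.bit_length.times do |idx|
    yield idx if bits[idx] == 1
  end
end

live = 0
live |= 2**10  # add R10 to the set (same mask as shifting 1 left by 10)
live |= 2**12  # add R12 to the set

# Integer#[] reads a single bit, which doubles as a membership test.
r12_live = live[12] == 1
r11_live = live[11] == 1

seen = []
each_bit(live) { |idx| seen.push(idx) }
puts seen.inspect
```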

&lt;p&gt;Like most abstract interpretations, it doesn’t matter what order we iterate
over the collection of basic blocks for correctness, but it &lt;em&gt;does&lt;/em&gt; matter for
performance. In this case, iterating backwards (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;post_order&lt;/code&gt;) converges much
faster than forwards (&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;reverse_post_order&lt;/code&gt;):&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;compute_initial_liveness_sets&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Map of Block -&amp;gt; what variables it alone needs to be live-in&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Map of Block -&amp;gt; what variables it alone defines&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;instructions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reverse_each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as_vreg&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;vreg_ins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vreg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vreg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;param&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;param&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;num&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;analyze_liveness&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;order&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;post_order&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;compute_initial_liveness_sets&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Map from Block -&amp;gt; what variables are live-in&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;true&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;false&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;order&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# Union-ing all the successors&apos; live-in sets gives us this block&apos;s&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# live-out, which is a good starting point for computing the live-in&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;successors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;succ&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gen&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;~&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;changed&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;true&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block_live&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;We could also use a worklist here, and it would be faster, but eh. Repeatedly
iterating over all blocks is fine for now.&lt;/p&gt;
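
&lt;p&gt;For the curious, a worklist version might look something like the sketch
below. It is not the code from this post: it uses Set instead of bitsets,
string block names instead of Block objects, and gen/kill sets that I derived
by hand from the B1 through B4 example IR that appears in this post. It assumes
SSA (each register is defined once), like the loop above.&lt;/p&gt;

```ruby
require "set"

# Hypothetical CFG: B1 jumps to B2, B2 branches to B3 or B4, B3 loops to B2.
successors = { "B1" => ["B2"], "B2" => ["B3", "B4"], "B3" => ["B2"], "B4" => [] }
predecessors = Hash.new { |h, k| h[k] = [] }
successors.each { |blk, succs| succs.each { |s| predecessors[s].push(blk) } }

# Hand-derived summaries: gen is every register a block reads, kill is every
# register it defines (including block parameters).
gen = {
  "B1" => Set["R11"],
  "B2" => Set["R13"],
  "B3" => Set["R12", "R13", "R14", "R15"],
  "B4" => Set["R10", "R12", "R16"],
}
kill = {
  "B1" => Set["R10", "R11"],
  "B2" => Set["R12", "R13"],
  "B3" => Set["R14", "R15"],
  "B4" => Set["R16"],
}

live_in = Hash.new { |h, k| h[k] = Set.new }
worklist = successors.keys.reverse  # visit later blocks first
until worklist.empty?
  block = worklist.shift
  live_out = successors[block].map { |s| live_in[s] }.reduce(Set.new, :|)
  new_live_in = (live_out | gen[block]) - kill[block]
  next if new_live_in == live_in[block]
  live_in[block] = new_live_in
  # Only predecessors can be affected by a change to this block's live-in.
  predecessors[block].each do |pred|
    worklist.push(pred) unless worklist.include?(pred)
  end
end

successors.each_key { |blk| puts "#{blk}: #{live_in[blk].to_a.sort.inspect}" }
```

&lt;p&gt;The difference from the loop above is that a block only gets revisited when
one of its successors actually changed, instead of every block being revisited
on every pass.&lt;/p&gt;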

&lt;p&gt;The Wimmer2010 paper skips this liveness analysis entirely by assuming some
computed information about your CFG: where loops start and end. It also
requires all loop blocks be contiguous. Then it makes variables defined before
a loop and used at any point inside the loop live &lt;em&gt;for the whole loop&lt;/em&gt;. By
having this information available, it folds the liveness analysis into the live
range building, which we’ll instead do separately in a moment.&lt;/p&gt;

&lt;p&gt;The Wimmer approach sounded complicated and finicky. Maybe it is, maybe it
isn’t. So I went with a dataflow liveness analysis instead. If it turns out to
be the slow part, maybe it will matter enough to learn about this loop tagging
method.&lt;/p&gt;

&lt;p&gt;For now, we will pick a &lt;em&gt;schedule&lt;/em&gt; for the control-flow graph.&lt;/p&gt;

&lt;h2 id=&quot;scheduling&quot;&gt;Scheduling&lt;/h2&gt;

&lt;p&gt;In order to build live ranges, you have to have some kind of numbering system
for your instructions, otherwise a live range’s start and end are meaningless.
We can write a function that fixes a particular block order (in this case,
reverse post-order) and then assigns each block and instruction a number in a
linear sequence. You can think of this as flattening or projecting the graph:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;number_instructions!&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@block_order&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rpo&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# just so we match the Wimmer2010 paper&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@block_order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;blk&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;blk&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;blk&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;instructions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;blk&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;number&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;!--
I think using RPO is just a heuristic; other block orders may shrink live
ranges, reduce parallel moves, etc. I could be way off base here but I don&apos;t
think we even have to order the blocks in dominance order because we&apos;re doing a
full dataflow-based liveness analysis; the ordering and positioning
requirements from Wimmer2010 come from the quick liveness with loop headers
thing.
--&gt;

&lt;p&gt;A couple of interesting things to note:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We number blocks because we use block starts as the start index for all of
that block’s parameters&lt;/li&gt;
  &lt;li&gt;We start numbering at 16 just so we can eyeball things and make sure they
line up with the Wimmer2010 paper&lt;/li&gt;
  &lt;li&gt;We only give out even numbers because later we’ll insert loads and stores at
odd-numbered instructions
    &lt;ul&gt;
      &lt;li&gt;Cinder does this to &lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/2b8774f077d6ef441207067411d157bb4f94a40b/cinderx/Jit/lir/regalloc.cpp#L243&quot;&gt;separately identify instruction input and instruction output&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;Vox &lt;a href=&quot;https://github.com/MrSmith33/vox/blob/b49cc734d6e5119e20229ee2d14612e33c6a5499/source/vox/be/reg_alloc/linear_scan.d#L10&quot;&gt;splits only at odd positions&lt;/a&gt;&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even though we have extra instructions, it looks very similar to the example in
the Wimmer2010 paper.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;16: label B1(R10, R11):
18: jmp B2($1, R11)
     # vvvvvvvvvv #
20: label B2(R12, R13)
22: cmp R13, $1
24: branch lessThan B4() else B3()

26: label B3()
28: mul R12, R13 -&amp;gt; R14
30: sub R13, $1 -&amp;gt; R15
32: jump B2(R14, R15)

34: label B4()
     # ^^^^^^^^^^ #
36: add R10, R12 -&amp;gt; R16
38: ret R16
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Since we’re not going to be messing with the order of the instructions within a
block anymore, all we have to do going forward is make sure that we iterate
through the blocks in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@block_order&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Finally, we have all that we need to compute live ranges.&lt;/p&gt;

&lt;h2 id=&quot;live-ranges&quot;&gt;Live ranges&lt;/h2&gt;

&lt;p&gt;We’ll more or less copy the algorithm to compute live ranges from the
Wimmer2010 paper. We’ll have two main differences:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;We’re going to compute live ranges, not live intervals (as they do in the
paper)&lt;/li&gt;
  &lt;li&gt;We’re going to use our dataflow liveness analysis, not the loop header thing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I know I said we were going to be computing live ranges. So why am I presenting
you with a function called &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;build_intervals&lt;/code&gt;? That’s because early in
the history of linear scan (Traub1998!), people moved from having a single range for a
particular virtual register to having &lt;em&gt;multiple&lt;/em&gt; disjoint ranges. This
collection of multiple ranges is called an &lt;em&gt;interval&lt;/em&gt; and it exists to free up
registers in the context of branches.&lt;/p&gt;

&lt;p&gt;For example, in our IR snippet (above), R12 is defined in B2 as a block
parameter, used in B3, and then not used again until some indeterminate point in
B4. (Our example uses it immediately in an add instruction to keep things
short, but pretend the second use is some time away.)&lt;/p&gt;

&lt;p&gt;The Wimmer2010 paper creates a &lt;em&gt;lifetime hole&lt;/em&gt; between 28 and 34, meaning that the
interval for R12 (called i12) is &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[[20, 28), [34, ...)]&lt;/code&gt;. Interval holes are
not strictly necessary—they exist to generate better code. So for this post,
we’re going to start simple and assume 1 interval == 1 range. We may come back
later and add additional ranges, but that will require some fixes to our later
implementation. We’ll note where we think those fixes should happen.&lt;/p&gt;
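
&lt;p&gt;To illustrate what “1 interval == 1 range” gives up, here is a tiny sketch
that collapses an interval with a hole into a single covering range. The
positions are made up for illustration and are not taken from the paper or from
our example.&lt;/p&gt;

```ruby
# Hypothetical half-open ranges [from, to) for one virtual register, with a
# lifetime hole between position 4 and position 10.
ranges = [[0, 4], [10, 12]]

# One covering range: start at the earliest from, end at the latest to.
from = ranges.map { |start, _stop| start }.min
to = ranges.map { |_start, stop| stop }.max
single = [from, to]

# The hole is now treated as occupied, so the register cannot be lent out
# to another value during those positions.
hole = [ranges[0][1], ranges[1][0]]

puts "single range: [#{single[0]}, #{single[1]})"
puts "hole that is no longer free: [#{hole[0]}, #{hole[1]})"
```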

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;BUILDINTERVALS
for each block b in reverse order do
  live = union of successor.liveIn for each successor of b
  for each phi function phi of successors of b do
    live.add(phi.inputOf(b))
  for each opd in live do
    intervals[opd].addRange(b.from, b.to)
  for each operation op of b in reverse order do
    for each output operand opd of op do
      intervals[opd].setFrom(op.id)
      live.remove(opd)
    for each input operand opd of op do
      intervals[opd].addRange(b.from, op.id)
      live.add(opd)
  for each phi function phi of b do
    live.remove(phi.output)
  if b is loop header then
    loopEnd = last block of the loop starting at b
    for each opd in live do
      intervals[opd].addRange(b.from, loopEnd.to)
  b.liveIn = live
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Anyway, here is the mostly-copied annotated implementation of BuildIntervals
from the Wimmer2010 paper:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;build_intervals&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;key&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@block_order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reverse_each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# live = union of successor.liveIn for each successor of b&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# this is the *live out* of the current block since we&apos;re going to be&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# iterating backwards over instructions&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;successors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;succ&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;live_in&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;succ&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# for each phi function phi of successors of b do&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;#   live.add(phi.inputOf(b))&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;live&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;out_vregs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vreg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vreg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;num&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reduce&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;each_bit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;live&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;opd&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vreg&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;idx&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;instructions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reverse_each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;as_vreg&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# for each output operand opd of op do&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;#   intervals[opd].setFrom(op.id)&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;set_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# for each input operand opd of op do&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;#   intervals[opd].addRange(b.from, op.id)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;vreg_ins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opd&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;opd&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;default_proc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;nil&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;freeze&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Another difference is that since we’re using block parameters, we don’t really
have this &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;phi.inputOf&lt;/code&gt; thing. That’s just the block argument.&lt;/p&gt;

&lt;p&gt;The last difference is that since we’re skipping the loop liveness hack, we
don’t modify a block’s &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;live&lt;/code&gt; set as we iterate through instructions.&lt;/p&gt;

&lt;p&gt;I know we said we’re building live ranges, so our &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Interval&lt;/code&gt; class only has
one &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Range&lt;/code&gt; on it. This is Ruby’s built-in range, but it’s really just being
used as a tuple of integers here.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Interval&lt;/span&gt;
  &lt;span class=&quot;nb&quot;&gt;attr_reader&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:range&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;add_range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ArgumentError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Invalid range: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; to &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt;
      &lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;begin&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;set_from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ArgumentError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Invalid range: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; to &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# This happens when we don&apos;t have a use of the vreg&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# If we don&apos;t have a use, the live range is very short&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;Range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;==&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is_a?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;vi&quot;&gt;@range&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;other&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that there’s some implicit behavior happening here:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;If we haven’t initialized a range yet, we build one automatically&lt;/li&gt;
  &lt;li&gt;If we have a range, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_range&lt;/code&gt; builds the smallest range that covers both
the existing range and the incoming one&lt;/li&gt;
  &lt;li&gt;If we have a range, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set_from&lt;/code&gt; may shrink it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For example, if we have &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1, 5)&lt;/code&gt; and someone calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_range(7, 10)&lt;/code&gt;, we end
up with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1, 10)&lt;/code&gt;. There’s no gap in the middle.&lt;/p&gt;

&lt;p&gt;And if we have &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[1, 7)&lt;/code&gt; and someone calls &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;set_from(3)&lt;/code&gt;, we end up with &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[3,
7)&lt;/code&gt;.&lt;/p&gt;
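
&lt;p&gt;To make those two examples concrete, here is a small runnable sketch. The
class is a condensed copy of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Interval&lt;/code&gt; above, meant to behave the same
way and repeated only so the snippet stands alone. Ruby prints these ranges in
its own &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;a..b&lt;/code&gt; notation even though we read them as half-open.&lt;/p&gt;

```ruby
# Condensed copy of the Interval class above so this snippet runs on its own.
class Interval
  attr_reader :range

  def add_range(from, to)
    raise ArgumentError, "Invalid range: #{from} to #{to}" if from >= to
    # Widen to the smallest single range covering both old and new.
    from = [@range.begin, from].min if @range
    to = [@range.end, to].max if @range
    @range = Range.new(from, to)
  end

  def set_from(from)
    if @range.nil?
      # No use seen yet: the range collapses to the definition point.
      @range = Range.new(from, from)
    elsif from >= @range.end
      raise ArgumentError, "Invalid range: #{from} to #{@range.end}"
    else
      @range = Range.new(from, @range.end)
    end
  end
end

a = Interval.new
a.add_range(1, 5)
a.add_range(7, 10)
p a.range  # one range from 1 to 10; the hole between 5 and 7 is gone

b = Interval.new
b.add_range(1, 7)
b.set_from(3)
p b.range  # now starts at 3 and still ends at 7
```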

&lt;p&gt;After figuring out from scratch some of these assumptions about what the
interval/range API should and should not do, Aaron and I realized that there
was some actual code for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;add_range&lt;/code&gt; in a different, earlier paper: &lt;a href=&quot;/assets/img/linear-scan-ra-context-ssa.pdf&quot;&gt;Linear
Scan Register Allocation in the Context of SSA Form and Register
Constraints&lt;/a&gt; (PDF, 2002) by Mössenböck and Pfeiffer.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ADDRANGE(i: Instruction; b: Block; end: integer)
  if b.first.n ≤ i.n ≤ b.last.n then range ← [i.n, end[ else range ← [b.first.n, end[
  add range to interval[i.n] // merging adjacent ranges
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Unfortunately, many other versions of this PDF look absolutely horrible (like
bad OCR) and I had to do some digging to find the version linked above.&lt;/p&gt;
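
&lt;p&gt;The “merging adjacent ranges” comment in that pseudocode is where a
multi-range interval would differ from our single-range one. As a rough sketch
(this is not code from the paper, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;MultiRangeInterval&lt;/code&gt; is a made-up name, and the
block-boundary clamping on the first line of ADDRANGE is left out), the merging
step could look something like this in Ruby:&lt;/p&gt;

```ruby
# Hypothetical interval that keeps several half-open [from, to) ranges,
# so lifetime holes survive instead of being swallowed.
class MultiRangeInterval
  attr_reader :ranges

  def initialize
    @ranges = []  # sorted, pairwise disjoint and non-touching [from, to] pairs
  end

  def add_range(from, to)
    raise ArgumentError, "Invalid range: #{from} to #{to}" unless to > from
    kept = []
    @ranges.each do |lo, hi|
      if [lo, from].max > [hi, to].min
        # Strictly apart from the incoming range: keep it untouched.
        kept.push([lo, hi])
      else
        # Overlapping or touching: absorb it into the incoming range.
        from = [from, lo].min
        to = [to, hi].max
      end
    end
    kept.push([from, to])
    @ranges = kept.sort
  end
end

i = MultiRangeInterval.new
i.add_range(1, 5)
i.add_range(7, 10)
p i.ranges  # two ranges; the hole between 5 and 7 is preserved
i.add_range(5, 7)
p i.ranges  # the new range touches both neighbors, so all three merge
```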

&lt;p&gt;Finally we can start thinking about doing some actual register assignment.
Let’s return to the 90s.&lt;/p&gt;

&lt;h2 id=&quot;linear-scan&quot;&gt;Linear scan&lt;/h2&gt;

&lt;p&gt;Because we have faithfully kept 1 interval == 1 range, we can re-use the linear
scan algorithm from Poletto1999 (which looks, at a glance, to be the same
as 1997).&lt;/p&gt;

&lt;p&gt;I recommend looking at the PDF side by side with the code. We have tried to
keep the structure very similar.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;LinearScanRegisterAllocation
active ← {}
foreach live interval i, in order of increasing start point
  ExpireOldIntervals(i)
  if length(active) = R then
    SpillAtInterval(i)
  else
    register[i] ← a register removed from pool of free registers
    add i to active, sorted by increasing end point

ExpireOldIntervals(i)
foreach interval j in active, in order of increasing end point
  if endpoint[j] ≥ startpoint[i] then
    return
  remove j from active
  add register[j] to pool of free registers

SpillAtInterval(i)
spill ← last interval in active
if endpoint[spill] &amp;gt; endpoint[i] then
  register[i] ← register[spill]
  location[spill] ← new stack location
  remove spill from active
  add i to active, sorted by increasing end point
else
  location[i] ← new stack location
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Note that unlike in many programming languages these days, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{}&lt;/code&gt; in the
algorithm description represents a &lt;em&gt;set&lt;/em&gt;, not a (hash-)map.&lt;/p&gt;

&lt;p&gt;In our Ruby code, we represent &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;active&lt;/code&gt; as an array:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;ye_olde_linear_scan&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_registers&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_registers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;ArgumentError&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Number of registers must be positive&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;free_registers&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Set&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_registers&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Active intervals, sorted by increasing end point&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Map from Interval to PReg|StackSlot&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;num_stack_slots&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Iterate through intervals in order of increasing start point&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# TODO(max): Build a deque for intervals, pushing to the front, so we&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# automatically get this in sorted order&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sorted_intervals&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;sort_by&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;begin&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sorted_intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_vreg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# expire_old_intervals(interval)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select!&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;active_interval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;active_interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;begin&lt;/span&gt;
          &lt;span class=&quot;kp&quot;&gt;true&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;operand&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;fetch&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;active_interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Should be assigned a register&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;unless&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;operand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is_a?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PReg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;free_registers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;add&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;operand&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;kp&quot;&gt;false&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_registers&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# spill_at_interval(interval)&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# Pick an interval to spill. Picking the longest-lived active one is&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# a heuristic from the original linear scan paper.&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;spill&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;last&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# In either case, we need to allocate a slot on the stack.&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;slot&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;StackSlot&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;num_stack_slots&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;num_stack_slots&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# The last active interval ends further away than the current&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# interval; spill the last active interval.&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Should be assigned a register&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;unless&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is_a?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PReg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;spill&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;slot&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;pop&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# We know spill is the last one&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# Insert interval into already-sorted active&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;insert_idx&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;bsearch_index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert_idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# The current interval ends further away than the last active&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# interval; spill the current interval.&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;slot&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;reg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;free_registers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;min&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;free_registers&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;delete&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;PReg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;reg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# Insert interval into already-sorted active&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;insert_idx&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;bsearch_index&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;range&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;end&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;active&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insert_idx&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;assignment&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_stack_slots&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Internalizing this took us a bit. It is mostly a three-state machine:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;have not been allocated&lt;/li&gt;
  &lt;li&gt;have been allocated a register&lt;/li&gt;
  &lt;li&gt;have been allocated a stack slot&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We would like to come back to this and incrementally modify it as we add
lifetime holes to intervals.&lt;/p&gt;

&lt;p&gt;I finally understood, very late in the game, that Poletto1999 linear scan assigns only one
location per virtual register. &lt;em&gt;Ever&lt;/em&gt;. It’s not that every virtual register
gets a shot in a register and then gets moved to a stack slot (that would be
interval splitting, and hopefully we get to that later). If a virtual register
gets spilled, it’s in a stack slot from beginning to end.&lt;/p&gt;

&lt;p&gt;I only found this out accidentally after trying to figure out a bug (that
wasn’t a bug) due to a lovely sentence in &lt;a href=&quot;/assets/img/optimized-interval-splitting-linear-scan-ra.pdf&quot;&gt;Optimized Interval Splitting in a
Linear Scan Register Allocator&lt;/a&gt; (PDF, 2005) by
Wimmer and Mössenböck:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;However, it cannot deal with lifetime holes and does not split intervals, so
an interval has either a register assigned for the whole lifetime, or it is
spilled completely.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Also,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;In particular, it is not possible to implement the algorithm without
reserving a scratch register: When a spilled interval is used by an
instruction requiring the operand in a register, the interval must be
temporarily reloaded to the scratch register&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Also,&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Additionally, register constraints for method calls and instructions
requiring fixed registers must be handled separately&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Marvelous.&lt;/p&gt;

&lt;p&gt;Let’s take a look at the code snippet again. Here it is before register
allocation, using virtual registers:&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;16: label B1(R10, R11)
18: jump B2($1, R11)
     # vvvvvvvvvv #
20: label B2(R12, R13)
22: cmp R13, $1
24: branch lessThan B4()

26: label B3()
28: mul R12, R13 -&amp;gt; R14
30: sub R13, $1 -&amp;gt; R15
32: jump B2(R14, R15)

34: label B4()
     # ^^^^^^^^^^ #
36: add R10, R12 -&amp;gt; R16
38: ret R16
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Let’s run it through register allocation with incrementally decreasing numbers
of physical registers available. We get the following assignments:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;4 registers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{R10: P0, R11: P1, R12: P1, R13: P2, R14: P3, R15: P2, R16: P0}&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;3 registers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{R10: Stack[0], R11: P1, R12: P1, R13: P2, R14: P0, R15: P2, R16: P0}&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;2 registers &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{R10: Stack[0], R11: P1, R12: Stack[1], R13: P0, R14: P1, R15: P0, R16: P0}&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;1 register &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;{R10: Stack[0], R11: P0, R12: Stack[1], R13: P0, R14: Stack[2], R15: P0, R16: P0}&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some other things to note:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;If you have a register free, choosing which register to allocate is a
heuristic! It is tunable. There is probably some research out there that
explores the space.&lt;/p&gt;

    &lt;p&gt;In fact, you might even consider &lt;em&gt;not&lt;/em&gt; allocating a register greedily. What
might that look like? I have no idea.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Spilling the interval with the furthest endpoint is a heuristic! You can
pick any active interval you want. In &lt;a href=&quot;/assets/img/register-spilling-range-splitting-ssa.pdf&quot;&gt;Register Spilling and Live-Range
Splitting for SSA-Form Programs&lt;/a&gt; (PDF,
2009) by Braun and Hack, for example, they present the MIN algorithm, which
spills the interval with the furthest next use.&lt;/p&gt;

    &lt;p&gt;This requires slightly more information and takes slightly more time than
the default heuristic but apparently generates much better code.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Also, block ordering? You guessed it. Heuristic.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
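
&lt;p&gt;To make the difference between those two spill heuristics concrete, here is
a minimal sketch. The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Interval&lt;/code&gt; struct, its fields, and the positions are all
made up for illustration; this is not the allocator’s real data structure, and
it is only in the spirit of MIN rather than a faithful implementation of Braun
and Hack.&lt;/p&gt;

```ruby
# Hypothetical interval record (not the allocator's real class): a name, an end
# position, and a sorted list of positions where the value is used.
Interval = Struct.new(:name, :to, :uses)

# Default heuristic: spill the active interval that ends furthest away.
def spill_by_furthest_end(active)
  active.max_by { |interval| interval.to }
end

# MIN-style heuristic: spill the active interval whose next use at or after
# `position` is furthest away. An interval with no remaining use counts as
# infinitely far away.
def spill_by_furthest_next_use(active, position)
  active.max_by do |interval|
    interval.uses.find { |use| use >= position } || Float::INFINITY
  end
end

# Made-up example: A lives longest but is needed again almost immediately.
active = [
  Interval.new("A", 40, [12, 40]),
  Interval.new("B", 30, [28, 30]),
]

puts spill_by_furthest_end(active).name
puts spill_by_furthest_next_use(active, 10).name
```

&lt;p&gt;With these numbers the two heuristics disagree: the first picks A because it
ends last, and the second picks B because A is needed again at position 12.&lt;/p&gt;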

&lt;p&gt;Here is an example “slideshow” I generated by running linear scan with 2
registers. Use the arrow keys to navigate forward and backward in time&lt;sup id=&quot;fnref:rsms&quot;&gt;&lt;a href=&quot;#fn:rsms&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;4&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;

&lt;iframe src=&quot;/assets/lsra.html&quot; width=&quot;100%&quot; onload=&quot;this.style.height = this.contentWindow.document.documentElement.scrollHeight + &apos;px&apos;;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;resolving-ssa&quot;&gt;Resolving SSA&lt;/h2&gt;

&lt;p&gt;At this point we have register &lt;em&gt;assignments&lt;/em&gt;: we have a hash table mapping
intervals to physical locations. That’s great but we’re still in SSA form:
labelled code regions don’t have block arguments in hardware. We need to write
some code to take us out of SSA and into the real world.&lt;/p&gt;

&lt;p&gt;We can use a modified Wimmer2010 as a great starting point here. It handles more
than we need right now (interval splitting), but we can simplify.&lt;/p&gt;

&lt;div class=&quot;language-plaintext highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;RESOLVE
for each control flow edge from predecessor to successor do
  for each interval it live at begin of successor do
    if it starts at begin of successor then
      phi = phi function defining it
      opd = phi.inputOf(predecessor)
      if opd is a constant then
        moveFrom = opd
      else
        moveFrom = location of intervals[opd] at end of predecessor
    else
      moveFrom = location of it at end of predecessor
    moveTo = location of it at begin of successor
    if moveFrom ≠ moveTo then
      mapping.add(moveFrom, moveTo)
  mapping.orderAndInsertMoves()
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Because we don’t split intervals, we know that every interval live at the
beginning of a block is either:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;live across an edge between two blocks and therefore has already been placed
in a location by assignment/spill code&lt;/li&gt;
  &lt;li&gt;beginning its life at the beginning of the block as a block parameter and
therefore needs to be moved from its source location&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For this reason, we only handle the second case in our SSA resolution. If we
added &lt;del&gt;lifetime holes&lt;/del&gt; interval splitting, we would have to go back to the
full Wimmer SSA resolution.&lt;/p&gt;

&lt;p&gt;This means that we’re going to iterate over every outbound edge from every
block. For each edge, we’re going to insert some parallel moves.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolve_ssa&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignments&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@block_order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;predecessor&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;outgoing_edges&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;predecessor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;edges&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;num_successors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;outgoing_edges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;outgoing_edges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;mapping&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;successor&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;block&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;successor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;moveFrom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;moveTo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;moveFrom&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;moveTo&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;mapping&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;moveFrom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;moveTo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# predecessor.order_and_insert_moves(mapping)&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# TODO: order_and_insert_moves&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Remove all block parameters and arguments; we have resolved SSA&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@block_order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clear&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;edges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;args&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;clear&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;This already looks very similar to the RESOLVE function from Wimmer2010.
Unfortunately, Wimmer2010 basically shrugs off &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;orderAndInsertMoves&lt;/code&gt; with an &lt;em&gt;eh, it’s
already in the literature&lt;/em&gt; comment.&lt;/p&gt;
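
&lt;p&gt;Why does ordering matter at all? The moves on an edge are &lt;em&gt;parallel&lt;/em&gt;:
conceptually, every source is read before any destination is written. Here is
a small sketch of that difference. The helper names
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_sequential&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;run_parallel&lt;/code&gt; and the machine state are made up for
illustration and are not part of the allocator.&lt;/p&gt;

```ruby
# Apply moves one at a time, in order; each move sees earlier writes.
def run_sequential(state, moves)
  moves.each_with_object(state.dup) do |(src, dst), locations|
    locations[dst] = locations[src]
  end
end

# Apply moves "all at once": read every source before writing anything.
def run_parallel(state, moves)
  values = moves.map { |src, _dst| state[src] }
  result = state.dup
  moves.each_with_index { |(_src, dst), i| result[dst] = values[i] }
  result
end

# Hypothetical state: P0 holds :x and P1 holds :y. The moves form a swap.
state = { "P0" => :x, "P1" => :y }
swap = [["P0", "P1"], ["P1", "P0"]]

p run_parallel(state, swap)
p run_sequential(state, swap)
```

&lt;p&gt;Run in parallel, the two registers exchange values. Run naively in order, the
first move clobbers P1 before the second move reads it, so both registers end
up holding &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;:x&lt;/code&gt;. Getting from the first behavior to a correct sequence of
ordinary moves is the whole job.&lt;/p&gt;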

&lt;h3 id=&quot;a-brief-and-frustrating-parallel-moves-detour&quot;&gt;A brief and frustrating parallel moves detour&lt;/h3&gt;

&lt;p&gt;What’s not made clear, though, is that this particular subroutine has been the
source of a significant number of bugs in the literature. Only recently did
some folks roll through and suggest (proven!) fixes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/assets/img/parallel-move-leroy.pdf&quot;&gt;Battling windmills with Coq: formal verification of a compilation algorithm
for parallel moves&lt;/a&gt; (PDF, 2007) by Rideau, Serpette, and
Leroy&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;/assets/img/boissinot-out-ssa.pdf&quot;&gt;Revisiting Out-of-SSA Translation for Correctness, Code Quality, and
Efficiency&lt;/a&gt; (PDF, 2009) by Boissinot, Darte, Rastello,
Dupont de Dinechin, and Guillon.
    &lt;ul&gt;
      &lt;li&gt;and again in &lt;a href=&quot;/assets/img/boissinot-thesis.pdf&quot;&gt;Boissinot’s thesis&lt;/a&gt; (PDF, 2010)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This sent us down a deep rabbit hole of trying to understand what bugs occur,
when, and how to fix them. We implemented both the Leroy and the Boissinot
algorithms. We found differences between Boissinot2009, Boissinot2010, and the
SSA book implementation following those algorithms. We found Paul Sokolovsky’s
&lt;a href=&quot;https://github.com/pfalcon/parcopy/&quot;&gt;implementation with bugfixes&lt;/a&gt;. We found
Dmitry Stogov’s &lt;a href=&quot;https://github.com/pfalcon/parcopy/pull/1&quot;&gt;unmerged pull
request&lt;/a&gt; to the same repository to
fix another bug.&lt;/p&gt;

&lt;p&gt;We looked at Benoit Boissinot’s thesis again and emailed him some questions. He
responded! And then he even put up an &lt;a href=&quot;https://github.com/bboissin/thesis_bboissin&quot;&gt;amended version of his
algorithm&lt;/a&gt; in Rust with tests and
fuzzing.&lt;/p&gt;

&lt;p&gt;All this is to say that this is still causing people grief and, though I
understand page limits, I wish parallel moves were not handwaved away.&lt;/p&gt;

&lt;p&gt;We ended up with this implementation, which passes all of the tests from
Sokolovsky’s repository as well as the example from Boissinot’s thesis (though,
as we discussed in the email, the example solution in the thesis is
incorrect&lt;sup id=&quot;fnref:thesis-correction&quot;&gt;&lt;a href=&quot;#fn:thesis-correction&quot; class=&quot;footnote&quot; rel=&quot;footnote&quot; role=&quot;doc-noteref&quot;&gt;5&lt;/a&gt;&lt;/sup&gt;).&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c1&quot;&gt;# copies contains an array of [src, dst] arrays&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;sequentialize&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;copies&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;ready&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Contains only destination regs (&quot;available&quot;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;to_do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Contains only destination regs&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Map of destination reg -&amp;gt; what reg is written to it (its source)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{}&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Map of reg -&amp;gt; the current location where the initial value of reg is available (&quot;resource&quot;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;emit_copy&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# We add an arrow here just for clarity in reading this algorithm because&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# different people do [src, dst] and [dst, src] depending on if they prefer&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Intel or AT&amp;amp;T&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;-&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

  &lt;span class=&quot;c1&quot;&gt;# In Ruby, loc[x] is nil if x not in loc, so this loop could be omitted&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;copies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;nil&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;copies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;key?&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# Alternatively, to_do.include? dst&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;Conflicting assignments to destination &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;, latest: &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;#{&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;to_do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;n&quot;&gt;copies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# All destinations that are not also sources can be written to immediately (tree leaves)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;ready&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

  &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;empty?&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;while&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ready&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;pop&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt; &lt;span class=&quot;c1&quot;&gt;# a in the paper&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;emit_copy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;c1&quot;&gt;# pred[b] is now living at b&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to_do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;include?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;to_do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;delete&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;include?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;ready&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to_do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;empty?&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;break&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;to_do&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;pop&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;!=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;emit_copy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;tmp&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;loc&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;tmp&quot;&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;ready&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
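
&lt;p&gt;If you want to sanity-check the output of a function like this, one option is
a tiny symbolic checker. This is a sketch under assumptions, not the test
harness from any of the repositories mentioned: the name
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;sequentialization_correct?&lt;/code&gt; is made up, and it only assumes the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[src, &quot;-&amp;gt;&quot;, dst]&lt;/code&gt; result format used above.&lt;/p&gt;

```ruby
# Hypothetical helper, not part of the implementation above. It symbolically
# executes `sequence` (an array of [src, "->", dst] triples) and reports whether
# every [src, dst] pair in `copies` ended up with the right value.
def sequentialization_correct?(copies, sequence)
  # Each location starts out holding a token that names its own initial value.
  locations = Hash.new { |hash, loc| hash[loc] = "initial value of #{loc}" }
  sequence.each do |src, _arrow, dst|
    locations[dst] = locations[src]
  end
  copies.all? { |src, dst| locations[dst] == "initial value of #{src}" }
end

# Hand-written sequences for a swap: one using a temporary, one naive.
swap = [["a", "b"], ["b", "a"]]
with_tmp = [["a", "->", "tmp"], ["b", "->", "a"], ["tmp", "->", "b"]]
naive = [["a", "->", "b"], ["b", "->", "a"]]

p sequentialization_correct?(swap, with_tmp)
p sequentialization_correct?(swap, naive)
```

&lt;p&gt;The sequence with the temporary checks out and the naive one does not. A
checker like this says nothing about how many temporaries or moves were used,
only whether the values land in the right places.&lt;/p&gt;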

&lt;p&gt;Leroy’s algorithm, which is shorter, passes almost all the tests—in one test
case, it uses one more temporary variable than Boissinot’s does. We haven’t
spent much time looking at why.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;move_one&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:being_moved&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;case&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:to_move&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;move_one&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;when&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:being_moved&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;-&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;tmp&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;tmp&quot;&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;-&amp;gt;&quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]]&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:moved&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;leroy_sequentialize&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;copies&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;copies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;copies&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;it&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:to_move&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[]&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each_with_index&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;item&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;item&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:to_move&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;move_one&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;status&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;result&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;h3 id=&quot;back-to-ssa-resolution&quot;&gt;Back to SSA resolution&lt;/h3&gt;

&lt;p&gt;Whatever algorithm you choose, you now have a way to parallel move some
registers to some other registers. You have avoided the “swap problem”.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolve_ssa&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignments&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# predecessor.order_and_insert_moves(mapping)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sequentialize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mapping&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
          &lt;span class=&quot;no&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:mov&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# TODO: insert the moves!&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;That’s great. You can generate an ordered list of instructions from a tangled
graph. But where do you put them? What about the “lost copy” problem?&lt;/p&gt;

&lt;p&gt;As it turns out, we still need to handle critical edge splitting. Let’s
consider what it means to insert moves at an edge between blocks &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; B&lt;/code&gt; when
the surrounding CFG takes a few different shapes.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Case 1: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; B&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Case 2: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; B&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; C&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Case 3: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; B&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D -&amp;gt; B&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Case 4: &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; B&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; C&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D -&amp;gt; B&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are the four (really, three) cases we may come across.&lt;/p&gt;

&lt;p&gt;In Case 1, if we only have two neighboring blocks A and B, we can
insert the moves into either block. It doesn’t matter: at the end of A or at
the beginning of B are both fine.&lt;/p&gt;

&lt;p&gt;In Case 2, if A has two successors, then we should insert the moves at the
beginning of B. That way we won’t be mucking things up for the edge &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; C&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In Case 3, if B has two predecessors, then we should insert the moves at the
end of A. That way we won’t be mucking things up for the edge &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D -&amp;gt; B&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Case 4 is the most complicated. There is no extant place in the graph we can
insert moves. If we insert in A, we mess things up for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; C&lt;/code&gt;. If we insert
in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;B&lt;/code&gt;, we mess things up for &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D -&amp;gt; B&lt;/code&gt;. Inserting in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;C&lt;/code&gt; or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;D&lt;/code&gt; doesn’t make
any sense. What is there to do?&lt;/p&gt;

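&lt;p&gt;As it turns out, the edge &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; B&lt;/code&gt; in Case 4 is called a &lt;em&gt;critical edge&lt;/em&gt;: it leaves a block with multiple successors and enters a block with multiple predecessors. And we have to split it.&lt;/p&gt;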

&lt;p&gt;We can insert a new block E along the edge &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;A -&amp;gt; B&lt;/code&gt; and put the moves in E!
That way they still happen along the edge without affecting any other blocks.
Neat.&lt;/p&gt;

&lt;p&gt;In Ruby code, that looks like:&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolve_ssa&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignments&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;num_predecessors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Hash&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@block_order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;edges&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;num_predecessors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# predecessor.order_and_insert_moves(mapping)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# If we don&apos;t have any moves to insert, we don&apos;t have any block to&lt;/span&gt;
        &lt;span class=&quot;c1&quot;&gt;# insert&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;next&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;empty?&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_predecessors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;successor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_successors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# Make a new interstitial block&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;new_block&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert_moves_at_start&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;instructions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:jump&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;successor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[])])&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;edge&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;block&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;elsif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;num_successors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# Insert into the beginning of the block&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;successor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert_moves_at_start&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
          &lt;span class=&quot;c1&quot;&gt;# Insert into the end of the block... before the terminator&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;predecessor&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert_moves_at_end&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Adding a new block invalidates the cached &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;@block_order&lt;/code&gt;, so we also need to
recompute that.&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;We could also avoid that by splitting critical edges earlier, before
numbering. Then, when we arrive in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;resolve_ssa&lt;/code&gt;, we can clean up branches to
empty blocks!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;(See also &lt;a href=&quot;https://nickdesaulniers.github.io/blog/2023/01/27/critical-edge-splitting/&quot;&gt;Nick’s post on critical edge
splitting&lt;/a&gt;,
which also links to Faddegon’s thesis, which I should at least skim.)&lt;/p&gt;

&lt;p&gt;And that’s it, folks. We have gone from virtual registers in SSA form to
physical locations. Everything’s all hunky-dory. We can just turn these LIR
instructions into their very similar looking machine equivalents, right?&lt;/p&gt;

&lt;p&gt;Not so fast…&lt;/p&gt;

&lt;h2 id=&quot;calls&quot;&gt;Calls&lt;/h2&gt;

&lt;p&gt;You may have noticed that the original linear scan paper does not mention calls
or other register constraints. I didn’t really think about it until I wanted to
make a function call. The authors of later linear scan papers definitely
noticed, though; Wimmer2005 writes the following about Poletto1999:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;When a spilled interval is used by an instruction requiring the operand in a
register, the interval must be temporarily reloaded to the scratch register.
Additionally, register constraints for method calls and instructions
requiring fixed registers must be handled separately.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Fun. We will start off by handling calls and method parameters separately, we
will note that it’s not amazing code, and then we will eventually implement the
later papers, which handle register constraints more naturally.&lt;/p&gt;

&lt;p&gt;We’ll add a new function, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;handle_caller_saved_regs&lt;/code&gt;, and run it after register
allocation but before SSA resolution. We do it after register allocation so we
know where each virtual register goes but before resolution so we can still
inspect the virtual register operands.&lt;/p&gt;

&lt;p&gt;Its goal is to do a couple of things:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Insert special &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;push&lt;/code&gt; and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pop&lt;/code&gt; instructions around &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;call&lt;/code&gt; instructions to
preserve virtual registers that are used on the other side of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;call&lt;/code&gt;. We
only care about preserving virtual registers that are stored in physical
registers, though; no need to preserve anything that already lives on the
stack.&lt;/li&gt;
  &lt;li&gt;Do a parallel move of the call arguments into the ABI-specified parameter
registers. We need to do a parallel move in case any of the arguments happen
to already be living in parameter registers. (We’re really getting good
mileage out of this function.)&lt;/li&gt;
  &lt;li&gt;Make sure that the value returned by the call in the ABI-specified return
register ends up in the location allocated to the output of the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;call&lt;/code&gt;
instruction.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We’ll also remove the &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;call&lt;/code&gt; operands since we’re placing them in special
registers explicitly now.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;handle_caller_saved_regs&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;return_reg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;param_regs&lt;/span&gt;
    &lt;span class=&quot;vi&quot;&gt;@block_order&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;each&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;instructions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;flat_map&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;name&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:call&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;survivors&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;_vreg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;interval&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;survives?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;number&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vreg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;assignments&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vreg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]].&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;is_a?&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;PReg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;mov_input&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;out&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;return_reg&lt;/span&gt;

          &lt;span class=&quot;n&quot;&gt;ins&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;drop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&quot;too many call arguments for param_regs&quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;param_regs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;length&lt;/span&gt;

          &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;ins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;first&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;

          &lt;span class=&quot;n&quot;&gt;mapping&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ins&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;param_regs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to_h&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sequentialize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mapping&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
            &lt;span class=&quot;no&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:mov&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
          &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

          &lt;span class=&quot;n&quot;&gt;survivors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:push&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:mov&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mov_input&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;return_reg&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;survivors&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:pop&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kp&quot;&gt;nil&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;}.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;reverse&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
          &lt;span class=&quot;n&quot;&gt;insn&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;instructions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;
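
&lt;p&gt;The &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;.reverse&lt;/code&gt; on the pops matters because the stack is last-in, first-out. Here is a toy simulation of the round trip. It is not part of the allocator: the Hash standing in for registers, the Array standing in for the stack, and the register names are all assumptions made for illustration.&lt;/p&gt;

```ruby
# Toy model, for illustration only: registers are a Hash and the machine
# stack is an Array. None of these names come from the allocator above.
regs = { r1: 10, r2: 20 }
stack = []
survivors = [:r1, :r2]

# Save the survivors before the call: push r1, then push r2.
survivors.each { |reg| stack.push(regs[reg]) }

# Pretend the callee clobbers both registers.
survivors.each { |reg| regs[reg] = 0 }

# Restore in the opposite order: pop r2, then pop r1.
survivors.reverse.each { |reg| regs[reg] = stack.pop }
```

&lt;p&gt;Popping in push order instead would hand &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;r1&lt;/code&gt; the value saved from &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;r2&lt;/code&gt;, and vice versa.&lt;/p&gt;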

&lt;p&gt;(Unfortunately, this sidesteps the less-fun bit of calls in ABIs where
parameters after the 6th are expected on the stack. It also completely
ignores ABI size constraints.)&lt;/p&gt;
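
&lt;p&gt;For a sense of what the stack-passing part might involve, here is a minimal sketch that splits call arguments into the ones that fit in parameter registers and the ones left over for the stack. The helper &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;split_call_args&lt;/code&gt;, the constant, and the symbols standing in for registers are hypothetical and not part of the allocator above. The register list assumes the System V AMD64 integer parameter registers.&lt;/p&gt;

```ruby
# Hypothetical helper, not part of the allocator above.
# Assumes the System V AMD64 integer parameter registers.
SYSV_PARAM_REGS = [:rdi, :rsi, :rdx, :rcx, :r8, :r9].freeze

def split_call_args(args, param_regs = SYSV_PARAM_REGS)
  # The first N arguments travel in registers...
  reg_args = args.first(param_regs.length)
  # ...and anything left over would have to go on the stack.
  stack_args = args.drop(param_regs.length)
  # Source-to-destination Hash, the same shape as the mapping
  # handed to the parallel move.
  mapping = reg_args.zip(param_regs).to_h
  # System V pushes stack arguments right to left, so the leftmost
  # stack argument ends up nearest the top of the stack.
  [mapping, stack_args.reverse]
end

mapping, pushes = split_call_args([:v0, :v1, :v2, :v3, :v4, :v5, :v6, :v7])
```

&lt;p&gt;With eight arguments, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;mapping&lt;/code&gt; covers the first six and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;pushes&lt;/code&gt; lists the last two in push order. Stack alignment and argument sizes are still ignored here.&lt;/p&gt;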

&lt;p&gt;Now, you may have noticed that we don’t do anything special for the incoming
params of the function we’re compiling! That’s another thing we have to handle.
Thankfully, we can handle it with yet another parallel move (wow!) at the end
of &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;resolve_ssa&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;class&lt;/span&gt; &lt;span class=&quot;nc&quot;&gt;Function&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;def&lt;/span&gt; &lt;span class=&quot;nf&quot;&gt;resolve_ssa&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;intervals&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;assignments&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# ...&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# We&apos;re typically going to have more param regs than block parameters&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# When we zip the param regs with block params, we&apos;ll end up with param&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# regs mapping to nil. We filter those away by selecting for tuples&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# that have a truthy second value&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# [[x, y], [z, nil]].select(&amp;amp;:last) (reject the second tuple)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;mapping&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;param_regs&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;entry_block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;parameters&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;amp;&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:last&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;to_h&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sequentialize&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mapping&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;|&lt;/span&gt;
      &lt;span class=&quot;no&quot;&gt;Insn&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;new&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;ss&quot;&gt;:mov&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dst&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;entry_block&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;insert_moves_at_start&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sequence&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Again, this is yet another place where some of the later papers have
much better ergonomics and also generate much better code.&lt;/p&gt;

&lt;p&gt;But this is really cool! If you have arrived at this point with me, we have
successfully made it to 1997 and that is nothing to sneeze at. We have even
adapted research from 1997 to work with SSA, avoiding several significant
classes of bugs along the way.&lt;/p&gt;

&lt;!--
## Instruction selection and instruction splitting

## Interval splitting

## Register hints

What is this iterated linear scan thing? Appears in JSC

while (true) {
  linearscan();
  if (!would_spill) { break; }
  interval = pick_an_interval_to_spill();
  spill(interval);
  remove_interval(interval);
}
--&gt;

&lt;h2 id=&quot;validation-by-abstract-interpretation&quot;&gt;Validation by abstract interpretation&lt;/h2&gt;

&lt;p&gt;We have just built an enormously complex machine. Even out of the gate, with the
original linear scan, there is a lot of machinery. It’s possible to write tests
that spot-check sample programs of all shapes and sizes, but it’s &lt;em&gt;very&lt;/em&gt;
difficult to anticipate every possible edge case that will appear in the real
world.&lt;/p&gt;

&lt;p&gt;Even if the original algorithm you’re using has been proven correct, your
implementation may have subtle bugs due to (for example) having slightly
different invariants or even transcription errors.&lt;/p&gt;

&lt;p&gt;We have all these proof tools at our disposal: we can write an abstract
interpreter that verifies properties of &lt;em&gt;one&lt;/em&gt; graph, but it’s very hard
(impossible?) to scale that to sets of graphs.&lt;/p&gt;

&lt;p&gt;Maybe that’s enough, though. In one of my favorite blog posts, Chris Fallin
&lt;a href=&quot;https://cfallin.org/blog/2021/03/15/cranelift-isel-3/&quot;&gt;writes about&lt;/a&gt; writing a
register allocation verifier based on abstract interpretation. It can verify
one concrete LIR function at a time. It’s fast enough that it can be left on in
debug builds. This means that a decent chunk of the time (tests, CI, maybe a
production cluster) we can get a very clear signal that every register
assignment that passes through the verifier satisfies some invariants.&lt;/p&gt;

&lt;p&gt;Furthermore, we are not limited to Real World Code. With the advent of fuzzing,
one can imagine an always-on fuzzer that tries to break the register allocator.
A verifier can then catch bugs that come from exploring this huge search space.&lt;/p&gt;
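
&lt;p&gt;To make the idea concrete, here is a minimal sketch of such a checker for
straight-line code. It is not Chris’s design (his also has to deal with spills,
moves, and control flow), and everything in it is made up for illustration: the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Instr&lt;/code&gt; struct, the
&lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verify&lt;/code&gt; function, and
the register names. The abstract state maps each physical register to the
virtual register it currently holds, and every use checks that its assigned
register still holds the expected value.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical straight-line LIR: each instruction defines one virtual
# register (out) and reads zero or more others (ins).
Instr = Struct.new(:out, :ins)

# assignments maps each virtual register to a physical register.
# Returns one [instruction index, vreg, preg] entry for every operand whose
# assigned register does not hold that operand at the point of use.
def verify(instrs, assignments)
  holds = {}   # abstract state: physical register to the vreg it holds
  errors = []
  instrs.each_with_index do |insn, idx|
    # Check every use against the current abstract state.
    insn.ins.each do |vreg|
      preg = assignments.fetch(vreg)
      errors.push([idx, vreg, preg]) unless holds[preg] == vreg
    end
    # The definition overwrites whatever the output register held before.
    holds[assignments.fetch(insn.out)] = insn.out
  end
  errors
end

program = [
  Instr.new(:v0, []),
  Instr.new(:v1, []),
  Instr.new(:v2, [:v0, :v1]),
  Instr.new(:v3, [:v2, :v0]),
]

# v1 is dead after instruction 2, so v2 may reuse r1.
good = {v0: :r0, v1: :r1, v2: :r1, v3: :r0}
# v0 is still live after instruction 2, so giving v2 the register r0 is a bug.
bad = {v0: :r0, v1: :r1, v2: :r0, v3: :r1}
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;With these inputs, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verify(program, good)&lt;/code&gt;
returns an empty list and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;verify(program, bad)&lt;/code&gt;
returns &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[[3, :v0, :r0]]&lt;/code&gt;:
instruction 3 reads v0 from r0, but r0 was overwritten by v2.&lt;/p&gt;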

&lt;p&gt;Some time after finding Chris’s blog post, I also stumbled across &lt;a href=&quot;https://github.com/v8/v8/blob/cac6de03372c25987c6cbea49b4b39d9da437978/src/compiler/backend/register-allocator-verifier.h&quot;&gt;the very same
thing in
V8&lt;/a&gt;!&lt;/p&gt;

&lt;p&gt;I find this stuff so cool. I’ll also mention Boissinot’s &lt;a href=&quot;https://github.com/bboissin/thesis_bboissin&quot;&gt;Rust
code&lt;/a&gt; again because it does
something similar for parallel moves.&lt;/p&gt;
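
&lt;p&gt;A checker for parallel moves can be very small. The sketch below is my own
illustration, not a port of that code: the function names are hypothetical and
it assumes moves are plain &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;[src, dst]&lt;/code&gt;
pairs. It runs a move sequence on symbolic register contents and checks that
every destination ends up with the original contents of its source. The sample
data is the parallel copy a → b, b → c, c → a, c → d.&lt;/p&gt;

&lt;div class=&quot;language-ruby highlighter-rouge&quot;&gt;&lt;div class=&quot;highlight&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Run the moves one at a time on symbolic register contents. Each register
# starts out holding a token named after itself.
def run_moves(moves, regs)
  state = regs.to_h { |reg| [reg, reg] }
  moves.each { |src, dst| state[dst] = state.fetch(src) }
  state
end

# A parallel copy reads every source before writing any destination, so
# afterwards each dst must hold the original contents of its src.
def implements_parallel_copy?(moves, copy, regs)
  state = run_moves(moves, regs)
  copy.all? { |src, dst| state[dst] == src }
end

regs = [:a, :b, :c, :d]
copy = [[:a, :b], [:b, :c], [:c, :a], [:c, :d]]

# A hand-sequentialized order that saves c in d before overwriting it.
sequenced = [[:c, :d], [:b, :c], [:a, :b], [:d, :a]]
# Running the copy in the order it was written, which clobbers b too early.
in_order = copy
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;/div&gt;

&lt;p&gt;Here &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;implements_parallel_copy?(sequenced, copy, regs)&lt;/code&gt;
is true and &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;implements_parallel_copy?(in_order, copy, regs)&lt;/code&gt;
is false, because the in-order version overwrites b before b → c gets to read
it.&lt;/p&gt;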

&lt;h2 id=&quot;see-also&quot;&gt;See also&lt;/h2&gt;

&lt;p&gt;It’s possible to do linear scan allocation in reverse, at least on traces
without control flow. See, for example, &lt;a href=&quot;https://www.mattkeeter.com/blog/2022-10-04-ssra/&quot;&gt;The Solid-State Register
Allocator&lt;/a&gt;, the &lt;a href=&quot;https://github.com/LuaJIT/LuaJIT/blob/5e3c45c43bb0e0f1f2917d432e9d2dba12c42a6e/src/lj_asm.c#L198&quot;&gt;LuaJIT
register allocator&lt;/a&gt;, and &lt;a href=&quot;https://brrt-to-the-future.blogspot.com/2019/03/reverse-linear-scan-allocation-is.html&quot;&gt;Reverse Linear Scan Allocation is
probably a good idea&lt;/a&gt;.
By doing linear scan this way, it is also possible to avoid computing liveness
and intervals. I am not sure if this works on programs with control flow,
though.&lt;/p&gt;

&lt;h2 id=&quot;wrapping-up&quot;&gt;Wrapping up&lt;/h2&gt;

&lt;p&gt;We built a register allocator that works on SSA. Hopefully next time we will
add features such as lifetime holes, interval splitting, and register hints.&lt;/p&gt;

&lt;p&gt;The full Ruby code listing is &lt;del&gt;not (yet?) public&lt;/del&gt; &lt;a href=&quot;https://github.com/tenderworks/regalloc&quot;&gt;available under the Apache
2 license&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;UPDATE: See the post on &lt;a href=&quot;/blog/linear-scan-lifetime-holes/&quot;&gt;lifetime holes&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;thanks&quot;&gt;Thanks&lt;/h2&gt;

&lt;p&gt;Thanks to &lt;a href=&quot;https://waleedkhan.name/&quot;&gt;Waleed Khan&lt;/a&gt; and &lt;a href=&quot;https://mstdn.ca/@iainireland&quot;&gt;Iain
Ireland&lt;/a&gt; for giving feedback on this post.&lt;/p&gt;
&lt;div class=&quot;footnotes&quot; role=&quot;doc-endnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:calendaring&quot;&gt;
      &lt;p&gt;It’s not just about registers, either. In 2016, Facebook
engineer Dave &lt;a href=&quot;https://blog.waleedkhan.name/will-i-ever-use-this/&quot;&gt;legendarily used linear-scan register allocation to book
meeting rooms&lt;/a&gt;. &lt;a href=&quot;#fnref:calendaring&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:everyone&quot;&gt;
      &lt;p&gt;Well. As I said on one of the social media sites earlier this
year, “All AOT compilers are alike; each JIT compiler is fucked up in its
own way.”&lt;/p&gt;

      &lt;p&gt;JavaScript:&lt;/p&gt;

      &lt;!-- * V8&apos;s Maglev uses --&gt;
      &lt;ul&gt;
        &lt;li&gt;V8’s TurboFan uses &lt;a href=&quot;https://github.com/v8/v8/blob/12fa27f2f4d999320c524776ed29810c8694bafc/src/compiler/backend/register-allocator.h#L1545&quot;&gt;linear scan&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;SpiderMonkey uses a &lt;a href=&quot;https://searchfox.org/mozilla-central/rev/c85c168374483a3c37aab49d7f587ea74a516492/js/src/jit/BacktrackingAllocator.h#28-31&quot;&gt;backtracking allocator&lt;/a&gt;
based on &lt;a href=&quot;https://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html&quot;&gt;LLVM’s&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;JavaScriptCore uses &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/f5a9393bdeff7c89685de21aa9f2df392139cc07/Source/JavaScriptCore/b3/air/AirAllocateRegistersAndStackByLinearScan.h#L37&quot;&gt;linear scan “for
optLevel&amp;lt;2”&lt;/a&gt;
and &lt;a href=&quot;https://github.com/WebKit/WebKit/blob/f5a9393bdeff7c89685de21aa9f2df392139cc07/Source/JavaScriptCore/b3/air/AirAllocateRegistersByGraphColoring.h&quot;&gt;graph coloring otherwise&lt;/a&gt;
          &lt;ul&gt;
            &lt;li&gt;There’s also a “cssjit” with its own register allocator…&lt;/li&gt;
          &lt;/ul&gt;
        &lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;Java:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;HotSpot C1 uses (naturally) &lt;a href=&quot;https://github.com/openjdk/jdk/blob/87d734012e3130501bfd37b23cee7f5e0a3a476f/src/hotspot/share/c1/c1_LinearScan.hpp&quot;&gt;Wimmer2010 linear scan&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;HotSpot C2 uses &lt;a href=&quot;https://github.com/openjdk/jdk/blob/87d734012e3130501bfd37b23cee7f5e0a3a476f/src/hotspot/share/opto/regalloc.hpp&quot;&gt;Chaitin-Briggs-Click graph coloring&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;GraalVM uses &lt;a href=&quot;https://github.com/oracle/graal/blob/e482f988939235ce94ee4a756c6bcc1d3df2bab2/compiler/src/jdk.graal.compiler/src/jdk/graal/compiler/lir/alloc/lsra/LinearScan.java&quot;&gt;linear scan&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;
          &lt;!-- Dalvik, ART --&gt;
        &lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;Python:&lt;/p&gt;

      &lt;!-- * PyPy uses --&gt;
      &lt;ul&gt;
        &lt;li&gt;Cinder uses &lt;a href=&quot;https://github.com/facebookincubator/cinderx/blob/5cf14ad8a68b6f04c1ca1cb99947da7d8d09c28b/cinderx/Jit/lir/regalloc.h&quot;&gt;Wimmer2010 linear scan&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;S6 uses a &lt;a href=&quot;https://github.com/google-deepmind/s6/blob/69cac9c981fbd3217ed117c3898382cfe094efc0/src/code_generation/trace_register_allocator.h&quot;&gt;trace register allocator&lt;/a&gt;
          &lt;ul&gt;
            &lt;li&gt;This is a different thing than a tracing JIT; see &lt;a href=&quot;/assets/img/trace-ra.pdf&quot;&gt;Josef Eisl’s thesis&lt;/a&gt; (PDF, 2018)&lt;/li&gt;
          &lt;/ul&gt;
        &lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;Ruby:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;YJIT uses &lt;a href=&quot;https://github.com/ruby/ruby/blob/231407c251d82573f578caf569a934c0ebb344e5/yjit/src/backend/ir.rs#L1388&quot;&gt;linear scan&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;ZJIT uses more or less the same backend, so also linear scan&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;PHP:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;PHP uses &lt;a href=&quot;https://github.com/php/php-src/blob/77dace78c324ef731e60fa98b4b8008cd7df1657/ext/opcache/jit/ir/ir_ra.c#L3479&quot;&gt;linear scan&lt;/a&gt;&lt;/li&gt;
        &lt;li&gt;HHVM uses &lt;a href=&quot;https://github.com/facebook/hhvm/blob/e7bca518648e16bdb7c08e91d02f8c158d8e6c6f/hphp/runtime/vm/jit/vasm-xls.cpp#L1448&quot;&gt;extended linear scan&lt;/a&gt;&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;Lua:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;LuaJIT uses &lt;a href=&quot;https://github.com/LuaJIT/LuaJIT/blob/5e3c45c43bb0e0f1f2917d432e9d2dba12c42a6e/src/lj_asm.c#L198&quot;&gt;reverse linear scan&lt;/a&gt;&lt;/li&gt;
      &lt;/ul&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:everyone&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:allocate-on-ssa&quot;&gt;
      &lt;p&gt;&lt;a href=&quot;/assets/img/linear-scan-ra-context-ssa.pdf&quot;&gt;Linear Scan Register Allocation in the Context of SSA Form
and Register Constraints&lt;/a&gt; (PDF, 2002) by Mössenböck and
Pfeiffer notes:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;Our allocator relies on static single assignment form, which simplifies
data flow analysis and tends to produce short live intervals.&lt;/p&gt;
      &lt;/blockquote&gt;

      &lt;p&gt;&lt;a href=&quot;/assets/img/ra-programs-ssa.pdf&quot;&gt;Register allocation for programs in SSA-form&lt;/a&gt; (PDF, 2006)
by Hack, Grund, and Goos notes that interference graphs for SSA programs
are chordal and can be optimally colored in quadratic time.&lt;/p&gt;

      &lt;p&gt;&lt;a href=&quot;/assets/img/ssa-elimination-after-ra.pdf&quot;&gt;SSA Elimination after Register Allocation&lt;/a&gt; (PDF, 2008)
by Pereira and Palsberg notes:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;One of the main advantages of SSA based register allocation is the
separation of phases between spilling and register assignment.&lt;/p&gt;
      &lt;/blockquote&gt;

      &lt;p&gt;Cliff Click (private communication, 2025) notes:&lt;/p&gt;

      &lt;blockquote&gt;
        &lt;p&gt;It’s easier. Got it already, why lose it […] spilling always uses
use/def and def/use edges.&lt;/p&gt;
      &lt;/blockquote&gt;
      &lt;p&gt;&lt;a href=&quot;#fnref:allocate-on-ssa&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:rsms&quot;&gt;
      &lt;p&gt;This is inspired by &lt;a href=&quot;https://rsms.me/&quot;&gt;Rasmus Andersson&lt;/a&gt;’s graph
coloring &lt;a href=&quot;https://rsms.me/projects/chaitin/&quot;&gt;visualization&lt;/a&gt; that I saw some
years ago. &lt;a href=&quot;#fnref:rsms&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:thesis-correction&quot;&gt;
      &lt;p&gt;The example in the thesis is to sequentialize the
following parallel copy:&lt;/p&gt;

      &lt;ul&gt;
        &lt;li&gt;a → b&lt;/li&gt;
        &lt;li&gt;b → c&lt;/li&gt;
        &lt;li&gt;c → a&lt;/li&gt;
        &lt;li&gt;c → d&lt;/li&gt;
      &lt;/ul&gt;

      &lt;p&gt;The solution in the thesis is:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;c → d (c now lives in d)&lt;/li&gt;
        &lt;li&gt;a → c (a now lives in c)&lt;/li&gt;
        &lt;li&gt;b → a (b now lives in a)&lt;/li&gt;
        &lt;li&gt;d → b (why are we copying c to b?)&lt;/li&gt;
      &lt;/ol&gt;

      &lt;p&gt;but we think this is incorrect. Solving manually, Aaron and I got:&lt;/p&gt;

      &lt;ol&gt;
        &lt;li&gt;c → d (because d is not read from anywhere)&lt;/li&gt;
        &lt;li&gt;b → c (because c is “freed up”; now in d)&lt;/li&gt;
        &lt;li&gt;a → b (because b is “freed up”; now in c)&lt;/li&gt;
        &lt;li&gt;d → a (because c is now in d, so d → a is equivalent to old_c → a)&lt;/li&gt;
      &lt;/ol&gt;

      &lt;p&gt;which is what the code gives us, too. &lt;a href=&quot;#fnref:thesis-correction&quot; class=&quot;reversefootnote&quot; role=&quot;doc-backlink&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</description>
            <pubDate>Wed, 13 Aug 2025 00:00:00 +0000</pubDate>
            <niceDate>August 13, 2025</niceDate>
            <link>https://bernsteinbear.com/blog/linear-scan/?utm_source=rss</link>
            <guid isPermaLink="true">https://bernsteinbear.com/blog/linear-scan/</guid>
        </item>
        
    </channel>
</rss>
