<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
    <title>
        Disruptive Ninja | Vincent Bernat
    </title>
    <link href="http://vincent.bernat.im/en/blog/atom.xml" rel="self" />
    <link href="http://vincent.bernat.im/en"/>
    <id>http://www.luffy.cx/en/blog/atom.xml/</id>

            <updated>2013-02-21T00:04:56+01:00</updated>

            <entry>
            <title type="html">lldpd 0.7.1</title>
            <author><name>Vincent Bernat</name></author>
            <link href="http://vincent.bernat.im/en/blog/2013-lldpd-0.7.1.html"/>
            <updated>2013-02-20T09:28:19+01:00</updated>
            <id>http://www.luffy.cx/en/blog/2013-lldpd-0.7.1.html</id>

            <content type="html">
<![CDATA[
<p>A few weeks ago, a new version of <a href="http://vincentbernat.github.com/lldpd" title="lldpd, an implementation of 802.1AB">lldpd</a>, an 802.<span class="caps">1AB</span> (aka <abbr title="Link Layer Discovery Protocol"><span class="caps">LLDP</span></abbr>)
implementation for various Unices, was&nbsp;released.</p>
<p><a href="http://en.wikipedia.org/wiki/Link_Layer_Discovery_Protocol" title="Link Layer Discovery Protocol"><abbr title="Link Layer Discovery Protocol"><span class="caps">LLDP</span></abbr></a> is an industry standard protocol designed to supplant
proprietary Link-Layer protocols such as <span class="caps">EDP</span> or <span class="caps">CDP</span>. The goal of <abbr title="Link Layer Discovery Protocol"><span class="caps">LLDP</span></abbr>
is to provide an inter-vendor compatible mechanism to deliver
Link-Layer notifications to adjacent network&nbsp;devices.</p>
<p>In short, <abbr title="Link Layer Discovery Protocol"><span class="caps">LLDP</span></abbr> lets you know exactly which port a server is plugged into
(and vice versa). To illustrate its use, I have made an <a href="http://xkcd.com" title="xkcd">xkcd</a>-like&nbsp;strip:</p>
<p><img alt="xkcd-like strip for the use of LLDP" src="//d1g3mdmxf8zbo9.cloudfront.net/images/why-lldp.png" title="Why use LLDP?"></p>
<p>If you would like more information about <em>lldpd</em>, please have a look
at <a href="http://vincentbernat.github.com/lldpd" title="lldpd, an implementation of 802.1AB">its new dedicated website</a>. This blog post gives some insight into
various <strong>technical changes</strong> that have affected <em>lldpd</em> since its
last major release one year ago. Lots of C stuff&nbsp;ahead!</p>
<div class="toc">
<ul>
<li><a href="#version-changelog">Version <span class="amp">&amp;</span> changelog</a><ul>
<li><a href="#automated-version">Automated version</a></li>
<li><a href="#automated-changelog">Automated changelog</a></li>
</ul>
</li>
<li><a href="#core">Core</a><ul>
<li><a href="#c99">C99</a></li>
<li><a href="#logging">Logging</a></li>
<li><a href="#libevent">libevent</a></li>
</ul>
</li>
<li><a href="#client">Client</a><ul>
<li><a href="#serialization">Serialization</a></li>
<li><a href="#library">Library</a></li>
<li><a href="#cli"><span class="caps">CLI</span></a><ul>
<li><a href="#parsing-completion">Parsing <span class="amp">&amp;</span> completion</a></li>
<li><a href="#readline">Readline</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#os-specific-support"><span class="caps">OS</span> specific support</a><ul>
<li><a href="#netlink-on-linux">Netlink on Linux</a></li>
<li><a href="#bsd-support"><span class="caps">BSD</span> support</a></li>
<li><a href="#os-x-support"><span class="caps">OS</span> X support</a></li>
<li><a href="#upstart-and-systemd-support">Upstart and systemd support</a></li>
<li><a href="#os-include-files"><span class="caps">OS</span> include files</a></li>
</ul>
</li>
</ul>
</div>
<h1 id="version-changelog">Version <span class="amp">&amp;</span> changelog</h1>
<p><strong><span class="caps">UPDATED</span>:</strong> <a href="http://www.hadrons.org/~guillem/" title="Guillem Jover's Home Page">Guillem Jover</a> told me how he met the same goals for
<a href="http://libbsd.freedesktop.org/wiki/" title="libbsd">libbsd</a>:</p>
<ol>
<li>Save the version from git into <code>.dist-version</code> and use this file
    if it exists. This allows one to rebuild <code>./configure</code> from the
    published tarball without losing the version. This also handles
    <a href="https://www.mirbsd.org/permalinks/wlog-10_e20130220-tg.htm#e20130220-tg_wlog-10" title="GNU autotools generated files">Thorsten Glaser&#8217;s criticism</a>.</li>
<li>Include <code>CHANGELOG</code> in the <code>DISTCLEANFILES</code> variable.</li>
</ol>
<p>Since this is a better solution, I have <a href="https://github.com/vincentbernat/lldpd/commit/a888bea6f08687177330c2d95569864009e769d6" title="build: use the same way as libbsd for version and changelog">adopted</a> the appropriate
lines of code from <em>libbsd</em>. The two following sections are therefore
partly&nbsp;outdated.</p>
<h2 id="automated-version">Automated version</h2>
<p>In <code>configure.ac</code>, I was previously using a static version number that
I had to increase when&nbsp;releasing:</p>
<div class="codehilite"><pre><span class="err">AC_INIT([lldpd], [0.5.7], [bernat@luffy.cx])</span>
</pre></div>


<p>Since the information is present in the git tree, this seems a bit
redundant (and easy to forget). Taking the version from the git tree
is&nbsp;easy:</p>
<div class="codehilite"><pre><span class="err">AC_INIT([lldpd],</span>
<span class="err">        [m4_esyscmd_s([git describe --tags --always --match [0-9]* 2&gt; /dev/null || date +%F])],</span>
<span class="err">        [bernat@luffy.cx])</span>
</pre></div>


<p>If the head of the git tree is tagged, you get the exact tag (<code>0.7.1</code>
for example). If it is not, you get the nearest one, the number of
commits since it and part of the current hash (<code>0.7.1-29-g2909519</code> for&nbsp;example).</p>
<p>The drawback of this approach is that if you rebuild <code>configure</code>
from the released tarball, you don&#8217;t have the git tree and the version
will be a date. Just don&#8217;t do&nbsp;that.</p>
<h2 id="automated-changelog">Automated changelog</h2>
<p>Generating the changelog from git is a common practice. I had some
difficulty getting it right. Here is my attempt (I am using
<code>automake</code>):</p>
<div class="codehilite"><pre><span class="nv">dist_doc_DATA</span> <span class="o">=</span> <span class="caps">README</span>.md <span class="caps">NEWS</span> ChangeLog

.<span class="caps">PHONY</span>: <span class="k">$(</span>distdir<span class="k">)</span>/ChangeLog
dist-hook: <span class="k">$(</span>distdir<span class="k">)</span>/ChangeLog
<span class="k">$(</span>distdir<span class="k">)</span>/ChangeLog:
        <span class="k">$(</span>AM_V_GEN<span class="k">)if </span><span class="nb">test</span> -d <span class="k">$(</span>top_srcdir<span class="k">)</span>/.git; <span class="k">then</span> <span class="se">\</span>
          <span class="nv">prev</span><span class="o">=</span><span class="nv">$$</span><span class="o">(</span>git describe --tags --always --match <span class="o">[</span>0-9<span class="o">]</span>* 2&gt; /dev/null<span class="o">)</span> ; <span class="se">\</span>
          <span class="k">for </span>tag in <span class="nv">$$</span><span class="o">(</span>git tag | grep -E <span class="s1">&#39;^[0-9]+(\.[0-9]+){1,}$$&#39;</span> | sort -rn<span class="o">)</span>; <span class="k">do</span> <span class="se">\</span>
            <span class="k">if</span> <span class="o">[</span> x<span class="s2">&quot;$$prev&quot;</span> <span class="o">=</span> x <span class="o">]</span>; <span class="k">then </span><span class="nv">prev</span><span class="o">=</span><span class="nv">$$</span>tag ; <span class="k">fi</span> ; <span class="se">\</span>
            <span class="k">if</span> <span class="o">[</span> x<span class="s2">&quot;$$prev&quot;</span> <span class="o">=</span> x<span class="s2">&quot;$$tag&quot;</span> <span class="o">]</span>; <span class="k">then continue</span>; <span class="k">fi</span> ; <span class="se">\</span>
            <span class="nb">echo</span> <span class="s2">&quot;$$prev [$$(git log $$prev -1 --pretty=format:&#39;%ai&#39;)]:&quot;</span> ; <span class="se">\</span>
            <span class="nb">echo</span> <span class="s2">&quot;&quot;</span> ; <span class="se">\</span>
            git log --pretty<span class="o">=</span><span class="s1">&#39; - [%h] %s (%an)&#39;</span> <span class="nv">$$</span>tag..<span class="nv">$$</span>prev ; <span class="se">\</span>
            <span class="nb">echo</span> <span class="s2">&quot;&quot;</span> ; <span class="se">\</span>
            <span class="nv">prev</span><span class="o">=</span><span class="nv">$$</span>tag ; <span class="se">\</span>
          <span class="k">done</span> &gt; <span class="nv">$@</span> ; <span class="se">\</span>
<span class="cp">        else \</span>
<span class="cp">          touch $@ ; \</span>
<span class="cp">        fi</span>
ChangeLog:
        touch <span class="nv">$@</span>
</pre></div>


<p>Changelog entries are grouped by version. Since it is a bit verbose, I
still maintain a <code>NEWS</code> file with important&nbsp;changes.</p>
<h1 id="core">Core</h1>
<h2 id="c99">C99</h2>
<p>I have recently read <a href="http://oreilly.com/shop/product/0636920025108.html?bB=g" title="21st Century C book on O'Reilly">21st Century C</a>, which has some good bits and
also covers the ecosystem around C. I have definitely adopted
<a href="http://gcc.gnu.org/onlinedocs/gcc/Designated-Inits.html" title="Designated initializers in GCC documentation">designated initializers</a> in my coding style. Since they have been a <span class="caps">GCC</span> extension
for a long time, this is not a major compatibility&nbsp;problem.</p>
<p>Without designated&nbsp;initializers:</p>
<div class="codehilite"><pre><span class="k">struct</span> <span class="n">netlink_req</span> <span class="n">req</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">iovec</span> <span class="n">iov</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">sockaddr_nl</span> <span class="n">peer</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">msghdr</span> <span class="n">rtnl_msg</span><span class="p">;</span>

<span class="n">memset</span><span class="p">(</span><span class="o">&amp;</span><span class="n">req</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">req</span><span class="p">));</span>
<span class="n">memset</span><span class="p">(</span><span class="o">&amp;</span><span class="n">iov</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">iov</span><span class="p">));</span>
<span class="n">memset</span><span class="p">(</span><span class="o">&amp;</span><span class="n">peer</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">peer</span><span class="p">));</span>
<span class="n">memset</span><span class="p">(</span><span class="o">&amp;</span><span class="n">rtnl_msg</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">rtnl_msg</span><span class="p">));</span>

<span class="n">req</span><span class="p">.</span><span class="n">hdr</span><span class="p">.</span><span class="n">nlmsg_len</span> <span class="o">=</span> <span class="n">NLMSG_LENGTH</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">rtgenmsg</span><span class="p">));</span>
<span class="n">req</span><span class="p">.</span><span class="n">hdr</span><span class="p">.</span><span class="n">nlmsg_type</span> <span class="o">=</span> <span class="n">RTM_GETLINK</span><span class="p">;</span>
<span class="n">req</span><span class="p">.</span><span class="n">hdr</span><span class="p">.</span><span class="n">nlmsg_flags</span> <span class="o">=</span> <span class="n">NLM_F_REQUEST</span> <span class="o">|</span> <span class="n">NLM_F_DUMP</span><span class="p">;</span>
<span class="n">req</span><span class="p">.</span><span class="n">hdr</span><span class="p">.</span><span class="n">nlmsg_seq</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">req</span><span class="p">.</span><span class="n">hdr</span><span class="p">.</span><span class="n">nlmsg_pid</span> <span class="o">=</span> <span class="n">getpid</span><span class="p">();</span>
<span class="n">req</span><span class="p">.</span><span class="n">gen</span><span class="p">.</span><span class="n">rtgen_family</span> <span class="o">=</span> <span class="n">AF_PACKET</span><span class="p">;</span>
<span class="n">iov</span><span class="p">.</span><span class="n">iov_base</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">req</span><span class="p">;</span>
<span class="n">iov</span><span class="p">.</span><span class="n">iov_len</span> <span class="o">=</span> <span class="n">req</span><span class="p">.</span><span class="n">hdr</span><span class="p">.</span><span class="n">nlmsg_len</span><span class="p">;</span>
<span class="n">peer</span><span class="p">.</span><span class="n">nl_family</span> <span class="o">=</span> <span class="n">AF_NETLINK</span><span class="p">;</span>
<span class="n">rtnl_msg</span><span class="p">.</span><span class="n">msg_iov</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">iov</span><span class="p">;</span>
<span class="n">rtnl_msg</span><span class="p">.</span><span class="n">msg_iovlen</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">rtnl_msg</span><span class="p">.</span><span class="n">msg_name</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">peer</span><span class="p">;</span>
<span class="n">rtnl_msg</span><span class="p">.</span><span class="n">msg_namelen</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">sockaddr_nl</span><span class="p">);</span>
</pre></div>


<p>With designated&nbsp;initializers:</p>
<div class="codehilite"><pre><span class="k">struct</span> <span class="n">netlink_req</span> <span class="n">req</span> <span class="o">=</span> <span class="p">{</span>
    <span class="p">.</span><span class="n">hdr</span> <span class="o">=</span> <span class="p">{</span>
        <span class="p">.</span><span class="n">nlmsg_len</span> <span class="o">=</span> <span class="n">NLMSG_LENGTH</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">rtgenmsg</span><span class="p">)),</span>
        <span class="p">.</span><span class="n">nlmsg_type</span> <span class="o">=</span> <span class="n">RTM_GETLINK</span><span class="p">,</span>
        <span class="p">.</span><span class="n">nlmsg_flags</span> <span class="o">=</span> <span class="n">NLM_F_REQUEST</span> <span class="o">|</span> <span class="n">NLM_F_DUMP</span><span class="p">,</span>
        <span class="p">.</span><span class="n">nlmsg_seq</span> <span class="o">=</span> <span class="mi">1</span><span class="p">,</span>
        <span class="p">.</span><span class="n">nlmsg_pid</span> <span class="o">=</span> <span class="n">getpid</span><span class="p">()</span> <span class="p">},</span>
    <span class="p">.</span><span class="n">gen</span> <span class="o">=</span> <span class="p">{</span> <span class="p">.</span><span class="n">rtgen_family</span> <span class="o">=</span> <span class="n">AF_PACKET</span> <span class="p">}</span>
<span class="p">};</span>
<span class="k">struct</span> <span class="n">iovec</span> <span class="n">iov</span> <span class="o">=</span> <span class="p">{</span>
    <span class="p">.</span><span class="n">iov_base</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">req</span><span class="p">,</span>
    <span class="p">.</span><span class="n">iov_len</span> <span class="o">=</span> <span class="n">req</span><span class="p">.</span><span class="n">hdr</span><span class="p">.</span><span class="n">nlmsg_len</span>
<span class="p">};</span>
<span class="k">struct</span> <span class="n">sockaddr_nl</span> <span class="n">peer</span> <span class="o">=</span> <span class="p">{</span> <span class="p">.</span><span class="n">nl_family</span> <span class="o">=</span> <span class="n">AF_NETLINK</span> <span class="p">};</span>
<span class="k">struct</span> <span class="n">msghdr</span> <span class="n">rtnl_msg</span> <span class="o">=</span> <span class="p">{</span>
    <span class="p">.</span><span class="n">msg_iov</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">iov</span><span class="p">,</span>
    <span class="p">.</span><span class="n">msg_iovlen</span> <span class="o">=</span> <span class="mi">1</span><span class="p">,</span>
    <span class="p">.</span><span class="n">msg_name</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">peer</span><span class="p">,</span>
    <span class="p">.</span><span class="n">msg_namelen</span> <span class="o">=</span> <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">sockaddr_nl</span><span class="p">)</span>
<span class="p">};</span>
</pre></div>
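<p>One property worth spelling out: members that are not named in a designated initializer are implicitly zero-initialized, which is why the <code>memset()</code> calls can go away. Here is a minimal sketch of that rule. The <code>struct sketch_request</code> type and the <code>sketch_make_request()</code> helper are made up for the illustration; they are not part of <em>lldpd</em> or of the netlink headers.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure, only here to illustrate the rule. */
struct sketch_request {
    int         type;
    int         flags;
    int         seq;
    const char *name;
};

/* Build a request. Only .type and .seq are named: every other member
 * is implicitly zeroed (0 for integers, NULL for pointers). */
static struct sketch_request
sketch_make_request(int type)
{
    assert(type >= 0);
    struct sketch_request req = {
        .type = type,
        .seq = 1
    };
    return req;
}
```

<p>Calling <code>sketch_make_request(7)</code> gives a structure whose <code>flags</code> is 0 and whose <code>name</code> is <code>NULL</code>, without any explicit clearing.</p>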


<h2 id="logging">Logging</h2>
<p>Logging in <em>lldpd</em> was not extensive. Usually, when receiving a bug
report, I asked the reporter to add some additional <code>printf()</code> calls
to determine where the problem was. This was clearly
suboptimal. Therefore, I have added many <code>log_debug()</code> calls with the
ability to filter out some of them. For example, to debug interface
discovery, one can run <em>lldpd</em> with <code>lldpd -ddd -D interface</code>.</p>
<p>Moreover, I have added colors when logging to a terminal. This may
seem pointless but it is now far easier to tell warning messages apart from
debug&nbsp;ones.</p>
<p><img alt="logging output of lldpd" src="//d1g3mdmxf8zbo9.cloudfront.net/images/lldpd-logging.png" title="Example of colored logging output for lldpd"></p>
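<p>To make the idea of filtered debug messages more concrete, here is a simplified sketch. It is not the code used in <em>lldpd</em>: all <code>sketch_*</code> names are hypothetical, the token list has a fixed size, and colors are reduced to one <span class="caps">ANSI</span> sequence emitted only when the output is a terminal.</p>

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define SKETCH_MAX_TOKENS 8

/* Tokens accepted for debug output. An empty list accepts everything. */
static const char *sketch_accepted[SKETCH_MAX_TOKENS];
static size_t sketch_naccepted;

/* Register a token, as a "-D token" flag would do in this sketch.
 * Tokens beyond SKETCH_MAX_TOKENS are silently ignored. */
static void
sketch_log_accept(const char *token)
{
    assert(token != NULL);
    if (sketch_naccepted < SKETCH_MAX_TOKENS)
        sketch_accepted[sketch_naccepted++] = token;
}

/* Return 1 if debug messages for this token should be printed. */
static int
sketch_log_wanted(const char *token)
{
    if (sketch_naccepted == 0)
        return 1;
    for (size_t i = 0; i < sketch_naccepted; i++)
        if (strcmp(sketch_accepted[i], token) == 0)
            return 1;
    return 0;
}

/* Print a debug message on stderr.
 * Returns 1 if the message was printed, 0 if it was filtered out. */
static int
sketch_log_debug(const char *token, const char *fmt, ...)
{
    va_list ap;
    int color = isatty(STDERR_FILENO);

    if (!sketch_log_wanted(token))
        return 0;
    fprintf(stderr, "%s[DBG/%s]%s ",
        color ? "\033[1;34m" : "", token, color ? "\033[0m" : "");
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
    return 1;
}
```

<p>After <code>sketch_log_accept("interface")</code>, a call tagged <code>"interface"</code> is printed while one tagged with any other token is dropped.</p>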
<h2 id="libevent">libevent</h2>
<p>In <em>lldpd</em> 0.5.7, I was using my own <code>select()</code>-based event loop. It
worked but I didn&#8217;t want to grow a full-featured event loop inside
<em>lldpd</em>. Therefore, I switched to <a href="http://libevent.org/" title="libevent, an event notification library">libevent</a>.</p>
<p>The minimal required version of <em>libevent</em> is 2.0.5. A convenient way
to check the changes in <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> is to use <a href="http://upstream-tracker.org/versions/libevent.html" title="API compatibility report for libevent library">Upstream Tracker</a>, a website
tracking <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> and <span class="caps">ABI</span> changes for various libraries. This version of
<em>libevent</em> is not available in many stable distributions. For example,
Debian Squeeze or Ubuntu Lucid only have
1.4.13. I am also trying to keep compatibility with very old
distributions, like <span class="caps">RHEL</span> 2, which does not have a packaged <em>libevent</em>
at&nbsp;all.</p>
<p>For some users, it may be a burden to compile additional
libraries. Therefore, I have included <em>libevent</em> source code in
<em>lldpd</em> source tree (as a git submodule) and I am only using it if no
suitable system <em>libevent</em> is&nbsp;available.</p>
<p>Have a look at <a href="https://github.com/vincentbernat/lldpd/blob/4c1a8c6152215b9c1320e04f6c811404f27f53c8/m4/libevent.m4"><code>m4/libevent.m4</code></a> and
<a href="https://github.com/vincentbernat/lldpd/blob/0.7.1/src/daemon/Makefile.am"><code>src/daemon/Makefile.am</code></a> to see how this is&nbsp;done.</p>
<h1 id="client">Client</h1>
<h2 id="serialization">Serialization</h2>
<p><code>lldpctl</code> is a client querying <code>lldpd</code> to display discovered
neighbors. The communication is done through a Unix socket. Each
structure to be serialized over this socket should be described with a
string. For&nbsp;example:</p>
<div class="codehilite"><pre><span class="cp">#define STRUCT_LLDPD_DOT3_MACPHY &quot;(bbww)&quot;</span>
<span class="k">struct</span> <span class="n">lldpd_dot3_macphy</span> <span class="p">{</span>
        <span class="n">u_int8_t</span>                 <span class="n">autoneg_support</span><span class="p">;</span>
        <span class="n">u_int8_t</span>                 <span class="n">autoneg_enabled</span><span class="p">;</span>
        <span class="n">u_int16_t</span>                <span class="n">autoneg_advertised</span><span class="p">;</span>
        <span class="n">u_int16_t</span>                <span class="n">mau_type</span><span class="p">;</span>
<span class="p">};</span>
</pre></div>
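<p>To give an idea of how such a description string can drive generic code, here is a small sketch that walks it to compute the number of payload bytes it describes. The meaning of the letters is an assumption made for the example (<code>b</code> for one byte, <code>w</code> for a 16-bit word, parentheses delimiting a structure), and <code>sketch_described_size()</code> is a hypothetical helper, not a function from <em>lldpd</em>.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Walk a description string such as "(bbww)" and return the number of
 * payload bytes it describes, or -1 on an unknown character.
 * Assumptions for this sketch: 'b' is 1 byte, 'w' is 2 bytes and
 * parentheses only delimit a structure. */
static int
sketch_described_size(const char *fmt)
{
    int size = 0;

    assert(fmt != NULL);
    for (; *fmt != '\0'; fmt++) {
        switch (*fmt) {
        case '(':
        case ')':
            break;
        case 'b':
            size += 1;
            break;
        case 'w':
            size += 2;
            break;
        default:
            return -1;
        }
    }
    return size;
}
```

<p>Under these assumptions, <code>"(bbww)"</code> describes 6 bytes, which matches the two 8-bit and two 16-bit members of the structure above.</p>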


<p>I did not want to use stuff like <a href="http://code.google.com/p/protobuf/" title="Protocol Buffers - Google data interchange format">Protocol Buffers</a> because I didn&#8217;t
want to copy the existing structures to other structures before
serialization (and the other way after&nbsp;deserialization).</p>
<p>However, the serializer in <em>lldpd</em> could not handle references
to other structures, lists or circular references. I have written
another one which works by annotating a structure with some&nbsp;macros:</p>
<div class="codehilite"><pre><span class="k">struct</span> <span class="n">lldpd_chassis</span> <span class="p">{</span>
    <span class="n">TAILQ_ENTRY</span><span class="p">(</span><span class="n">lldpd_chassis</span><span class="p">)</span> <span class="n">c_entries</span><span class="p">;</span>
    <span class="n">u_int16_t</span>        <span class="n">c_index</span><span class="p">;</span>
    <span class="n">u_int8_t</span>         <span class="n">c_protocol</span><span class="p">;</span>
    <span class="n">u_int8_t</span>         <span class="n">c_id_subtype</span><span class="p">;</span>
    <span class="kt">char</span>            <span class="o">*</span><span class="n">c_id</span><span class="p">;</span>
    <span class="kt">int</span>              <span class="n">c_id_len</span><span class="p">;</span>
    <span class="kt">char</span>            <span class="o">*</span><span class="n">c_name</span><span class="p">;</span>
    <span class="kt">char</span>            <span class="o">*</span><span class="n">c_descr</span><span class="p">;</span>

    <span class="n">u_int16_t</span>        <span class="n">c_cap_available</span><span class="p">;</span>
    <span class="n">u_int16_t</span>        <span class="n">c_cap_enabled</span><span class="p">;</span>

    <span class="n">u_int16_t</span>        <span class="n">c_ttl</span><span class="p">;</span>

    <span class="n">TAILQ_HEAD</span><span class="p">(,</span> <span class="n">lldpd_mgmt</span><span class="p">)</span> <span class="n">c_mgmt</span><span class="p">;</span>
<span class="p">};</span>
<span class="n">MARSHAL_BEGIN</span><span class="p">(</span><span class="n">lldpd_chassis</span><span class="p">)</span>
<span class="n">MARSHAL_TQE</span>  <span class="p">(</span><span class="n">lldpd_chassis</span><span class="p">,</span> <span class="n">c_entries</span><span class="p">)</span>
<span class="n">MARSHAL_FSTR</span> <span class="p">(</span><span class="n">lldpd_chassis</span><span class="p">,</span> <span class="n">c_id</span><span class="p">,</span> <span class="n">c_id_len</span><span class="p">)</span>
<span class="n">MARSHAL_STR</span>  <span class="p">(</span><span class="n">lldpd_chassis</span><span class="p">,</span> <span class="n">c_name</span><span class="p">)</span>
<span class="n">MARSHAL_STR</span>  <span class="p">(</span><span class="n">lldpd_chassis</span><span class="p">,</span> <span class="n">c_descr</span><span class="p">)</span>
<span class="n">MARSHAL_SUBTQ</span><span class="p">(</span><span class="n">lldpd_chassis</span><span class="p">,</span> <span class="n">lldpd_mgmt</span><span class="p">,</span> <span class="n">c_mgmt</span><span class="p">)</span>
<span class="n">MARSHAL_END</span><span class="p">;</span>
</pre></div>


<p>Only pointers need to be annotated. The rest of the structure can
be serialized with just <code>memcpy()</code><sup id="fnref:uint16t"><a href="#fn:uint16t" rel="footnote">1</a></sup>. I think there is still
room for improvement. It should be possible to add annotations inside
the structure and avoid some duplication. Or maybe, using a
<a href="https://bitbucket.org/eliben/pycparser" title="pycparser, a parser for C language written in pure Python">C parser</a>? Or using the <span class="caps">AST</span> output from&nbsp;<span class="caps">LLVM</span>?</p>
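<p>The principle can be sketched by hand for a single structure. This is not how the <code>MARSHAL_*</code> macros are implemented: the <code>struct sketch_chassis</code> type and both functions are hypothetical, and only one string pointer is handled. The plain members travel with one <code>memcpy()</code>; the pointed-to string is appended and the pointer is rebuilt on the other side.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure with a single member needing special care. */
struct sketch_chassis {
    uint16_t index;
    uint16_t ttl;
    char    *name;
};

/* Serialize into a freshly allocated buffer. Returns its length, or 0
 * on allocation failure. Layout: raw structure bytes, then the
 * NUL-terminated name. */
static size_t
sketch_serialize(const struct sketch_chassis *c, unsigned char **out)
{
    assert(c != NULL && c->name != NULL && out != NULL);
    size_t namelen = strlen(c->name) + 1;
    size_t len = sizeof(*c) + namelen;
    unsigned char *buf = malloc(len);

    if (buf == NULL)
        return 0;
    /* Plain members are copied as is; the pointer value copied along
     * with them is meaningless for the receiver. */
    memcpy(buf, c, sizeof(*c));
    memcpy(buf + sizeof(*c), c->name, namelen);
    *out = buf;
    return len;
}

/* Rebuild a structure from a buffer. Returns 0 on success, -1 on error.
 * On success, the caller owns c->name and must free() it. */
static int
sketch_unserialize(const unsigned char *buf, size_t len,
    struct sketch_chassis *c)
{
    assert(buf != NULL && c != NULL);
    if (len <= sizeof(*c) || buf[len - 1] != '\0')
        return -1;
    memcpy(c, buf, sizeof(*c));
    size_t namelen = len - sizeof(*c);
    c->name = malloc(namelen);      /* replace the stale pointer */
    if (c->name == NULL)
        return -1;
    memcpy(c->name, buf + sizeof(*c), namelen);
    return 0;
}
```

<p>Such a raw copy only works when both ends share the same structure layout, which is a reasonable assumption for a daemon and a client built from the same source tree.</p>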
<h2 id="library">Library</h2>
<p>In <em>lldpd</em> 0.5.7, there are two possible entry points to interact
with the&nbsp;daemon:</p>
<ol>
<li>Through <span class="caps">SNMP</span> support. Only information available in <a href="http://www.ieee802.org/1/files/public/MIBs/LLDP-MIB-200505060000Z.txt" title="LLDP-MIB"><abbr title="Link Layer Discovery Protocol"><span class="caps">LLDP</span></abbr>-<span class="caps">MIB</span></a>
    is exported. Therefore, implementation-specific values are
    not available. Moreover, <span class="caps">SNMP</span> support is currently&nbsp;read-only.</li>
<li>Through <code>lldpctl</code>. Thanks to a contribution from Andreas
    Hofmeister, the output can be requested to be formatted as an <span class="caps">XML</span>&nbsp;document.</li>
</ol>
<p>Integration of <em>lldpd</em> into a network stack was therefore limited to one
of those two channels. As an example, you can have a look at
<a href="http://git.vyatta.com/git/?p=vyatta-lldp.git;a=summary" title="Integration of lldpd in Vyatta">how Vyatta made the integration</a> using the second&nbsp;solution.</p>
<p>To provide a more robust solution, I have added a shared library,
<code>liblldpctl</code>, with a stable and well-defined <abbr title="Application Programming Interface"><span class="caps">API</span></abbr>. <code>lldpctl</code> is now
using it. I have followed these guidelines<sup id="fnref:library"><a href="#fn:library" rel="footnote">2</a></sup>:</p>
<ul>
<li>Consistent naming (all exported symbols are prefixed by
   <code>lldpctl_</code>). No pollution of the global&nbsp;namespace.</li>
<li>Consistent return codes (on errors, all functions returning
   pointers are returning <code>NULL</code>, all functions returning integers are
   returning <code>-1</code>).</li>
<li>Reentrant and thread-safe. No global&nbsp;variables.</li>
<li>One well-documented <a href="https://github.com/vincentbernat/lldpd/blob/0.7.1/src/lib/lldpctl.h" title="lldpctl.h">include file</a>.</li>
<li>Reduce the use of boilerplate code. Don&#8217;t segfault on <code>NULL</code>,
   accept integer input as string, provide easy&nbsp;iterators,&nbsp;&#8230;</li>
<li>Asynchronous <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> for input/output. The library delegates reading
   and writing by calling user-provided functions. Those functions can
   yield their effects. In this case, the user has to call back into the
   library when data is available for reading or writing. It is
   therefore possible to integrate the library with any existing
   event-loop. A thin synchronous layer is provided on top of this
   <abbr title="Application Programming Interface"><span class="caps">API</span></abbr>.</li>
<li>Opaque types with accessor&nbsp;functions.</li>
</ul>
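<p>The asynchronous part of this list can be illustrated with a sketch. It does not reproduce the actual <code>liblldpctl</code> interface: every <code>sketch_*</code> name is hypothetical. The point is only to show a library that never touches a socket itself. It writes through a user-provided callback and is fed with whatever the application has read. Errors are reported with <code>-1</code>, in line with the return code convention above.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* User-provided writer: returns the number of bytes taken, or -1. */
typedef long (*sketch_send_cb)(const char *data, size_t len, void *user);

/* Hypothetical connection handle. A real library would keep it opaque. */
struct sketch_conn {
    sketch_send_cb send;    /* how to write, chosen by the application */
    void          *user;    /* passed back untouched to the callback */
    char           in[256]; /* bytes received but not yet parsed */
    size_t         inlen;
};

static void
sketch_conn_init(struct sketch_conn *c, sketch_send_cb send, void *user)
{
    assert(c != NULL && send != NULL);
    memset(c, 0, sizeof(*c));
    c->send = send;
    c->user = user;
}

/* Library side: emit a request without knowing anything about sockets. */
static long
sketch_request(struct sketch_conn *c, const char *req)
{
    if (c == NULL || req == NULL)
        return -1;
    return c->send(req, strlen(req), c->user);
}

/* Application side: call this once the event loop has read some bytes. */
static long
sketch_feed(struct sketch_conn *c, const char *data, size_t len)
{
    if (c == NULL || data == NULL || len > sizeof(c->in) - c->inlen)
        return -1;
    memcpy(c->in + c->inlen, data, len);
    c->inlen += len;
    return (long)len;
}

/* Example callback: "sends" by appending to an in-memory sink. A real
 * one would write to a socket owned by the application's event loop. */
struct sketch_sink {
    char   data[128];
    size_t len;
};

static long
sketch_send_to_sink(const char *data, size_t len, void *user)
{
    struct sketch_sink *sink = user;

    if (len > sizeof(sink->data) - sink->len)
        return -1;
    memcpy(sink->data + sink->len, data, len);
    sink->len += len;
    return (long)len;
}
```

<p>A synchronous layer then amounts to a pair of callbacks that block on the socket until the data has been written or read.</p>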
<p>Accessing bits of information is done through &#8220;atoms&#8221; which are opaque
containers of type <code>lldpctl_atom_t</code>. From an atom, you can extract
some properties as integers, strings, buffers or other atoms. The list
of ports is an atom. A port in this list is also an atom. The list of
<span class="caps">VLAN</span>s present on this port is an atom, as well as each <span class="caps">VLAN</span> in this
list. The <span class="caps">VLAN</span> name is a <code>NUL</code>-terminated string living in the scope
of an atom. Accessing a property is done by a handful of functions,
like <code>lldpctl_atom_get_str()</code>, using a specific key. For example, here
is how to display the list of <span class="caps">VLAN</span>s assuming you have one port as an&nbsp;atom:</p>
<div class="codehilite"><pre><span class="n">vlans</span> <span class="o">=</span> <span class="n">lldpctl_atom_get</span><span class="p">(</span><span class="n">port</span><span class="p">,</span> <span class="n">lldpctl_k_port_vlans</span><span class="p">);</span>
<span class="n">lldpctl_atom_foreach</span><span class="p">(</span><span class="n">vlans</span><span class="p">,</span> <span class="n">vlan</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">vid</span> <span class="o">=</span> <span class="n">lldpctl_atom_get_int</span><span class="p">(</span><span class="n">vlan</span><span class="p">,</span>
                               <span class="n">lldpctl_k_vlan_id</span><span class="p">);</span>
    <span class="n">name</span> <span class="o">=</span> <span class="n">lldpctl_atom_get_str</span><span class="p">(</span><span class="n">vlan</span><span class="p">,</span>
                                <span class="n">lldpctl_k_vlan_name</span><span class="p">);</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">vid</span> <span class="o">&amp;&amp;</span> <span class="n">name</span><span class="p">)</span>
        <span class="n">printf</span><span class="p">(</span><span class="s">&quot;<span class="caps">VLAN</span> %d: %s</span><span class="se">\n</span><span class="s">&quot;</span><span class="p">,</span> <span class="n">vid</span><span class="p">,</span> <span class="n">name</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">lldpctl_atom_dec_ref</span><span class="p">(</span><span class="n">vlans</span><span class="p">);</span>
</pre></div>


<p>Internally, an atom is typed and reference counted. The size of the
<abbr title="Application Programming Interface"><span class="caps">API</span></abbr> is greatly limited thanks to this concept. There are currently
more than one hundred pieces of information that can be retrieved from
<code>lldpd</code>.</p>
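<p>As an illustration of what "typed and reference counted" means, here is a minimal sketch. It is not the internal representation used by <code>liblldpctl</code>: the <code>struct sketch_atom</code> type and its functions are hypothetical, and <code>sketch_atom_dec_ref()</code> returns the remaining count only to make the behaviour observable.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical atom types. */
enum sketch_atom_type { SKETCH_ATOM_PORT, SKETCH_ATOM_VLAN };

struct sketch_atom {
    enum sketch_atom_type type;
    int refcount;
};

/* Create an atom owned by the caller (one reference). NULL on error. */
static struct sketch_atom *
sketch_atom_new(enum sketch_atom_type type)
{
    struct sketch_atom *atom = calloc(1, sizeof(*atom));

    if (atom == NULL)
        return NULL;
    atom->type = type;
    atom->refcount = 1;
    return atom;
}

/* Take an additional reference. NULL is accepted and ignored. */
static void
sketch_atom_inc_ref(struct sketch_atom *atom)
{
    if (atom != NULL)
        atom->refcount++;
}

/* Drop a reference and free the atom when nobody holds it anymore.
 * Returns the remaining number of references, or -1 for NULL. */
static int
sketch_atom_dec_ref(struct sketch_atom *atom)
{
    if (atom == NULL)
        return -1;
    assert(atom->refcount > 0);
    if (--atom->refcount == 0) {
        free(atom);
        return 0;
    }
    return atom->refcount;
}
```

<p>With such a scheme, a string extracted from an atom can stay valid as long as the atom is referenced, which matches the "living in the scope of an atom" wording used earlier.</p>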
<p>Ultimately, the library will also enable the full configuration of
<code>lldpd</code>. Currently, many aspects can only be configured through
command-line flags. The use of the library does not replace
<code>lldpctl</code> which will still be available and be the primary client of
the&nbsp;library.</p>
<h2 id="cli"><span class="caps">CLI</span></h2>
<p>A configuration file had been requested for a long time. I didn&#8217;t
want to include a parser in <code>lldpd</code>: I am trying to keep it small. It
was already possible to configure <code>lldpd</code> through
<code>lldpctl</code>. Locations, network policies and power policies were the
three items that could be configured this way. So, the next step was
to enable <code>lldpctl</code> to read a configuration file, parse it and send
the result to <code>lldpd</code>. As a bonus, why not provide a full <span class="caps">CLI</span>
accepting the same statements with inline help and&nbsp;completion?</p>
<h3 id="parsing-completion">Parsing <span class="amp">&amp;</span> completion</h3>
<p>Because of completion, it is difficult to use a <span class="caps">YACC</span> generated
parser. Instead, I define a tree where each node accepts a word. A
node is defined with this&nbsp;function:</p>
<div class="codehilite"><pre><span class="k">struct</span> <span class="n">cmd_node</span> <span class="o">*</span><span class="n">commands_new</span><span class="p">(</span>
    <span class="k">struct</span> <span class="n">cmd_node</span> <span class="o">*</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="p">,</span>
    <span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="p">,</span>
    <span class="kt">int</span><span class="p">(</span><span class="o">*</span><span class="n">validate</span><span class="p">)(</span><span class="k">struct</span> <span class="n">cmd_env</span><span class="o">*</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="p">),</span>
    <span class="kt">int</span><span class="p">(</span><span class="o">*</span><span class="n">execute</span><span class="p">)(</span><span class="k">struct</span> <span class="n">lldpctl_conn_t</span><span class="o">*</span><span class="p">,</span> <span class="k">struct</span> <span class="n">writer</span><span class="o">*</span><span class="p">,</span>
        <span class="k">struct</span> <span class="n">cmd_env</span><span class="o">*</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="p">),</span>
    <span class="kt">void</span> <span class="o">*</span><span class="p">);</span>
</pre></div>


<p>A node is defined&nbsp;by:</p>
<ul>
<li>its&nbsp;parent,</li>
<li>an optional accepted static&nbsp;token,</li>
<li>a help&nbsp;string,</li>
<li>an optional validation function and</li>
<li>an optional function to execute if the current token is&nbsp;accepted.</li>
</ul>
<p>When walking the tree, we maintain an environment which is both a
key-value store and a stack of positions in the tree. The validation
function can check the environment to see if we are in the right
context (we want to accept the keyword <code>foo</code> only once, for
example). The execution function can add the current token as a value
in the environment but it can also pop the current position in the
tree to resume the walk from a previous&nbsp;node.</p>
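<p>The environment described above can be sketched as follows. This is only an illustration of the "key-value store plus stack of positions" idea: <code>struct sketch_env</code>, <code>struct sketch_node</code> and the <code>env_*</code> helpers are hypothetical and are not the structures used by <code>lldpctl</code>.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ENV_MAX 16

/* Hypothetical node type: only the token matters for this sketch. */
struct sketch_node {
    const char *token;
};

struct sketch_env {
    const char *keys[ENV_MAX];          /* key-value store */
    const char *values[ENV_MAX];
    int nkeys;
    struct sketch_node *stack[ENV_MAX]; /* positions visited in the tree */
    int depth;
};

void env_init(struct sketch_env *env)
{
    memset(env, 0, sizeof(*env));
}

const char *env_get(const struct sketch_env *env, const char *key)
{
    for (int i = 0; i < env->nkeys; i++)
        if (strcmp(env->keys[i], key) == 0)
            return env->values[i];
    return NULL;
}

/* Store a value; returns -1 when the store is full. */
int env_put(struct sketch_env *env, const char *key, const char *value)
{
    if (env->nkeys == ENV_MAX)
        return -1;
    env->keys[env->nkeys] = key;
    env->values[env->nkeys] = value;
    env->nkeys++;
    return 0;
}

/* Record that the walk entered `node`. */
int env_push(struct sketch_env *env, struct sketch_node *node)
{
    if (env->depth == ENV_MAX)
        return -1;
    env->stack[env->depth++] = node;
    return 0;
}

/* Pop `n` positions and return the node the walk resumes from. */
struct sketch_node *env_pop(struct sketch_env *env, int n)
{
    assert(n >= 0 && n <= env->depth);
    env->depth -= n;
    return env->depth > 0 ? env->stack[env->depth - 1] : NULL;
}

/* Validation helper: accept a keyword only if it was not seen yet. */
int check_no_env(const struct sketch_env *env, const char *key)
{
    return env_get(env, key) == NULL;
}
```

<p>After walking "root, keyword, value", storing the value and popping two positions brings the walk back to the root, and the validation helper then rejects a second occurrence of the same keyword.</p>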
<p>As an example, see how nodes for configuration of a coordinate-based
location are&nbsp;registered:</p>
<div class="codehilite"><pre><span class="cm">/* Our root node */</span>
<span class="k">struct</span> <span class="n">cmd_node</span> <span class="o">*</span><span class="n">configure_medloc_coord</span> <span class="o">=</span> <span class="n">commands_new</span><span class="p">(</span>
    <span class="n">configure_medlocation</span><span class="p">,</span>
    <span class="s">&quot;coordinate&quot;</span><span class="p">,</span> <span class="s">&quot;<span class="caps">MED</span> location coordinate configuration&quot;</span><span class="p">,</span>
    <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="nb"><span class="caps">NULL</span></span><span class="p">);</span>

<span class="cm">/* The exit node.</span>
<span class="cm">   The validate function will check if we have both</span>
<span class="cm">   latitude and longitude. */</span>
<span class="n">commands_new</span><span class="p">(</span><span class="n">configure_medloc_coord</span><span class="p">,</span>
    <span class="n"><span class="caps">NEWLINE</span></span><span class="p">,</span> <span class="s">&quot;Configure <span class="caps">MED</span> location coordinates&quot;</span><span class="p">,</span>
    <span class="n">cmd_check_env</span><span class="p">,</span> <span class="n">cmd_medlocation_coordinate</span><span class="p">,</span>
    <span class="s">&quot;latitude,longitude&quot;</span><span class="p">);</span>

<span class="cm">/* Store latitude. Once stored, we pop two positions</span>
<span class="cm">   to go back to the &quot;root&quot; node. The user can only</span>
<span class="cm">   enter latitude once. */</span>
<span class="n">commands_new</span><span class="p">(</span>
    <span class="n">commands_new</span><span class="p">(</span>
        <span class="n">configure_medloc_coord</span><span class="p">,</span>
        <span class="s">&quot;latitude&quot;</span><span class="p">,</span> <span class="s">&quot;Specify latitude&quot;</span><span class="p">,</span>
        <span class="n">cmd_check_no_env</span><span class="p">,</span> <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="s">&quot;latitude&quot;</span><span class="p">),</span>
    <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="s">&quot;Latitude as xx.yyyyN or xx.yyyyS&quot;</span><span class="p">,</span>
    <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="n">cmd_store_env_value_and_pop2</span><span class="p">,</span> <span class="s">&quot;latitude&quot;</span><span class="p">);</span>

<span class="cm">/* Same thing for longitude */</span>
<span class="n">commands_new</span><span class="p">(</span>
    <span class="n">commands_new</span><span class="p">(</span>
        <span class="n">configure_medloc_coord</span><span class="p">,</span>
        <span class="s">&quot;longitude&quot;</span><span class="p">,</span> <span class="s">&quot;Specify longitude&quot;</span><span class="p">,</span>
        <span class="n">cmd_check_no_env</span><span class="p">,</span> <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="s">&quot;longitude&quot;</span><span class="p">),</span>
    <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="s">&quot;Longitude as xx.yyyyE or xx.yyyyW&quot;</span><span class="p">,</span>
    <span class="nb"><span class="caps">NULL</span></span><span class="p">,</span> <span class="n">cmd_store_env_value_and_pop2</span><span class="p">,</span> <span class="s">&quot;longitude&quot;</span><span class="p">);</span>
</pre></div>


<p>The definition of all commands is still a bit verbose but the system
is simple enough yet powerful enough to cover all needed&nbsp;cases.</p>
<h3 id="readline">Readline</h3>
<p>When faced with a <span class="caps">CLI</span>, we usually expect some perks like completion,
history handling and help. The most used library to provide such
features is the <a href="http://www.gnu.org/software/readline/" title="The GNU Readline Library"><span class="caps">GNU</span> Readline Library</a>. Because this is a <abbr title="GNU General Public License"><span class="caps">GPL</span></abbr>
library, I first looked for an alternative. There are several of&nbsp;them:</p>
<ul>
<li><a href="http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libedit/" title="NetBSD libedit">NetBSD Editline library</a> (<code>libedit</code>).</li>
<li><a href="http://thrysoee.dk/editline/" title="Autotool port of libedit">Autotool port of the NetBSD Editline library</a> (also <code>libedit</code>).</li>
<li><a href="http://packages.qa.debian.org/e/editline.html">Debian version of the Editline library</a> (<code>libeditline</code>).</li>
<li><a href="https://github.com/antirez/linenoise" title="linenoise, a minimal readline replacement">linenoise</a>, a small and minimal readline&nbsp;library.</li>
<li>Many&nbsp;others.</li>
</ul>
<p>From an <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> point of view, the first three libraries support the <em><span class="caps">GNU</span>
Readline</em> <abbr title="Application Programming Interface"><span class="caps">API</span></abbr>. They also have a common native <abbr title="Application Programming Interface"><span class="caps">API</span></abbr>. Moreover, this
native <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> also handles tokenization. Therefore, I have developed the
first version of the <span class="caps">CLI</span> with this <abbr title="Application Programming Interface"><span class="caps">API</span></abbr><sup id="fnref:libeditapi"><a href="#fn:libeditapi" rel="footnote">3</a></sup>.</p>
<p>Unfortunately, I noticed later that this library is not very common in the
Linux world and is not available in <span class="caps">RHEL</span>. Since I had used the native
<abbr title="Application Programming Interface"><span class="caps">API</span></abbr>, it was not possible to fall back to the <em><span class="caps">GNU</span> Readline</em>
library. So, let&#8217;s switch! Thanks to the
<a href="http://www.gnu.org/software/autoconf-archive/ax_lib_readline.html" title="ax_lib_readline in autoconf archive">appropriate macro from the Autoconf Archive</a> (with
small modifications), the compilation and linking differences between
the libraries are taken care&nbsp;of.</p>
<p>Because the <em><span class="caps">GNU</span> Readline</em> library does not come with a tokenizer, I had
to write one myself. The <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> is also poorly documented and it is
difficult to know which symbol is available in which version. I have
limited myself&nbsp;to:</p>
<ul>
<li><code>readline()</code>, <code>add_history()</code>,</li>
<li><code>rl_insert_text()</code>,</li>
<li><code>rl_forced_update_display()</code>,</li>
<li><code>rl_bind_key()</code>,</li>
<li><code>rl_line_buffer</code> and <code>rl_point</code>.</li>
</ul>
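<p>As an illustration of what a hand-written tokenizer can look like, here is a minimal sketch. It is not the tokenizer shipped with <code>lldpctl</code>: the function name <code>tokenize()</code> and its rules (blank-separated words, double quotes to group words, no escape sequences) are assumptions made for the example.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split `line` in place into at most `max` tokens stored in `argv`.
   Returns the number of tokens, or -1 on an unterminated quote or
   when there are more than `max` tokens. */
int tokenize(char *line, char **argv, int max)
{
    int argc = 0;
    char *p = line;

    assert(line != NULL && argv != NULL);
    for (;;) {
        p += strspn(p, " \t");          /* skip leading blanks */
        if (*p == '\0')
            return argc;
        if (argc == max)
            return -1;
        if (*p == '"') {
            char *end = strchr(++p, '"');
            if (end == NULL)
                return -1;              /* unterminated quote */
            argv[argc++] = p;
            *end = '\0';
            p = end + 1;
        } else {
            argv[argc++] = p;
            p += strcspn(p, " \t");
            if (*p != '\0')
                *p++ = '\0';
        }
    }
}
```

<p>The buffer returned by <code>readline()</code> is writable, so it can be split in place like this before the words are fed to the command tree.</p>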
<p>Unfortunately, the various <code>libedit</code> libraries implement
<code>rl_bind_key()</code> as a no-op. Therefore, completion and online help are not
available with them. I have noticed that most <abbr title="Berkeley Software Distribution"><span class="caps">BSD</span></abbr>s ship with the <em><span class="caps">GNU</span>
Readline</em> library preinstalled, so it could be considered as a system
library. Nonetheless, linking with <code>libedit</code> to avoid licensing issues
is possible and help can be obtained by prefixing the command with
<code>help</code>.</p>
<h1 id="os-specific-support"><span class="caps">OS</span> specific support</h1>
<h2 id="netlink-on-linux">Netlink on Linux</h2>
<p>Previously, the list of interfaces was retrieved through
<code>getifaddrs()</code>. <em>lldpd</em> now uses <a href="http://en.wikipedia.org/wiki/Netlink" title="Netlink on Wikipedia">Netlink</a> directly on
Linux. This is not a big change since the <em><span class="caps">GNU</span> C Library</em> already uses
it to implement <code>getifaddrs()</code> and additional information, like <span class="caps">VLAN</span>,
is still retrieved through <code>ioctl()</code> or <em>sysfs</em>. However, <em>lldpd</em>
now gets notified when a change happens and updates all interfaces within
the next&nbsp;second.</p>
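<p>Getting notified of interface changes through <em>Netlink</em> can be sketched as below. This is a simplified, Linux-only illustration and not the code used by <em>lldpd</em>: the helper names <code>open_link_monitor()</code> and <code>link_event_ifindex()</code> are hypothetical, and a real client would also loop over multipart messages and parse attributes.</p>

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Open a Netlink socket subscribed to link and address changes.
   Returns the file descriptor, or -1 on error. */
int open_link_monitor(void)
{
    struct sockaddr_nl sa;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR;
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Return the index of the interface a link notification is about,
   or 0 if the message is not a complete RTM_NEWLINK/RTM_DELLINK. */
int link_event_ifindex(const struct nlmsghdr *h)
{
    const struct ifinfomsg *ifi;

    assert(h != NULL);
    if (h->nlmsg_type != RTM_NEWLINK && h->nlmsg_type != RTM_DELLINK)
        return 0;
    if (h->nlmsg_len < NLMSG_LENGTH(sizeof(*ifi)))
        return 0;
    ifi = (const struct ifinfomsg *)NLMSG_DATA(h);
    return ifi->ifi_index;
}
```

<p>A daemon would add the descriptor to its event loop, <code>recv()</code> the notifications and schedule a refresh of the interface list when a link event arrives.</p>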
<p>Like many other projects, I have written my own <em>Netlink</em> implementation
instead of using <a href="http://www.infradead.org/~tgr/libnl/" title="Netlink Protocol Library Suite">libnl</a>, a nice collection of libraries providing
everything you need to query the kernel through <em>Netlink</em>, including
some advanced bits.&nbsp;Why?</p>
<ol>
<li>
<p>The latest version of <a href="http://www.infradead.org/~tgr/libnl/" title="Netlink Protocol Library Suite">libnl</a> is still young and its
    availability in major distributions is scarce. It is not available
    in Debian Squeeze but will be available in Debian Wheezy. Like
    <em>libevent</em>, I could circumvent this problem by shipping the
    library with <em>lldpd</em> and using it when there is no system
    alternative.&nbsp;But&#8230;</p>
</li>
<li>
<p><em>libnl</em> is licensed under <abbr title="GNU Lesser General Public License"><span class="caps">LGPL</span></abbr> 2.1. This makes static linking
    difficult because the license is quite vague about whether static
    linking creates a derivative work. It is believed that it is authorized
    under the same provisions as in <abbr title="GNU Lesser General Public License"><span class="caps">LGPL</span></abbr> 3 which handles the case
    explicitly. This has been a problem for many projects. For
    example, <a href="http://www.ogre3d.org/" title="OGRE: Open Source 3D Graphics Engine"><span class="caps">OGRE</span></a> has added <a href="http://www.ogre3d.org/2009/03/06/lgpl-exclusions-added-static-linking-now-simpler" title="OGRE: static linking now simpler">an exception for static linking</a> in
    version 1.6 and <a href="http://www.ogre3d.org/2009/09/15/ogre-will-switch-to-the-mit-license-from-1-7" title="OGRE will switch to the MIT license from 1.7">switched to <span class="caps">MIT</span> license</a> in version&nbsp;1.7.</p>
</li>
</ol>
<p>I had a short discussion with Thomas Graf about this issue and he
seems willing to add a similar exception. This may take some time, but
once this is done, I will happily switch to <em>libnl</em> and retrieve more
stuff from <em>Netlink</em>.</p>
<h2 id="bsd-support"><abbr title="Berkeley Software Distribution"><span class="caps">BSD</span></abbr> support</h2>
<p>Until version 0.7, <em>lldpd</em> was Linux-only. The rewrite to use
<em>Netlink</em> was the opportunity to abstract interfaces and to port to other
operating systems. The first port was for <a href="http://www.debian.org/ports/kfreebsd-gnu/" title="Debian GNU/kFreeBSD">Debian <span class="caps">GNU</span>/kFreeBSD</a>, then for
<a href="http://www.freebsd.org" title="FreeBSD">FreeBSD</a>, <a href="http://www.openbsd.org" title="OpenBSD">OpenBSD</a> and <a href="http://www.netbsd.org" title="NetBSD">NetBSD</a>. They all share the same
source&nbsp;code:</p>
<ul>
<li><code>getifaddrs()</code> to get the list of&nbsp;interfaces,</li>
<li><code>bpf(4)</code> to attach to an interface to receive and send&nbsp;packets,</li>
<li><code>PF_ROUTE</code> socket to be notified when a change&nbsp;happens.</li>
</ul>
<p>Each <abbr title="Berkeley Software Distribution"><span class="caps">BSD</span></abbr> has its own <code>ioctl()</code> to retrieve <span class="caps">VLAN</span>, bridging and bonding
bits but they are quite similar. The code was usually adapted from
<code>ifconfig.c</code>.</p>
<p>The <abbr title="Berkeley Software Distribution"><span class="caps">BSD</span></abbr> ports have the same features as the Linux port,
except for <em>NetBSD</em> which lacks support for <abbr title="Link Layer Discovery Protocol"><span class="caps">LLDP</span></abbr>-<span class="caps">MED</span> inventory since I
didn&#8217;t find a simple way to retrieve <span class="caps">DMI</span> related&nbsp;information.</p>
<p>They also offer greater security by filtering sent packets. Moreover,
<em>OpenBSD</em> allows locking the filters set on the&nbsp;socket:</p>
<div class="codehilite"><pre><span class="cm">/* Install write filter (optional) */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">ioctl</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n"><span class="caps">BIOCSETWF</span></span><span class="p">,</span> <span class="p">(</span><span class="n">caddr_t</span><span class="p">)</span><span class="o">&amp;</span><span class="n">fprog</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">rc</span> <span class="o">=</span> <span class="n">errno</span><span class="p">;</span>
    <span class="n">log_info</span><span class="p">(</span><span class="s">&quot;privsep&quot;</span><span class="p">,</span> <span class="s">&quot;unable to setup write <span class="caps">BPF</span> filter for %s&quot;</span><span class="p">,</span>
        <span class="n">name</span><span class="p">);</span>
    <span class="k">goto</span> <span class="n">end</span><span class="p">;</span>
<span class="p">}</span>

<span class="cm">/* Lock interface */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">ioctl</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="n"><span class="caps">BIOCLOCK</span></span><span class="p">,</span> <span class="p">(</span><span class="n">caddr_t</span><span class="p">)</span><span class="o">&amp;</span><span class="n">enable</span><span class="p">)</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">rc</span> <span class="o">=</span> <span class="n">errno</span><span class="p">;</span>
    <span class="n">log_info</span><span class="p">(</span><span class="s">&quot;privsep&quot;</span><span class="p">,</span> <span class="s">&quot;unable to lock <span class="caps">BPF</span> interface %s&quot;</span><span class="p">,</span>
        <span class="n">name</span><span class="p">);</span>
    <span class="k">goto</span> <span class="n">end</span><span class="p">;</span>
<span class="p">}</span>
</pre></div>


<p>This is a very nice feature. <em>lldpd</em> is using a privileged process to
open the raw socket. The socket is then transmitted to an unprivileged
process. Without this feature, the unprivileged process can remove the
<span class="caps">BPF</span> filters. I have ported the
<a href="http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commit;h=d59577b6ffd313d0ab3be39cb1ab47e29bdc9182" title="Ability to lock a socket filter program">ability to lock a socket filter program</a> to Linux. However, I still
have to add a write&nbsp;filter.</p>
<h2 id="os-x-support"><span class="caps">OS</span> X support</h2>
<p>Once FreeBSD was supported, supporting <span class="caps">OS</span> X seemed easy. I got
sponsored by <a href="http://xcloud.me/" title="Xcloud - Mac cloud server">xcloud.me</a> which provided a virtual Mac server. Making
<em>lldpd</em> work with <span class="caps">OS</span> X took only two days, including a full hour to
guess how to get Apple Xcode without providing a credit&nbsp;card.</p>
<p>To help people install <em>lldpd</em> on <span class="caps">OS</span> X, I have also written a
<a href="https://github.com/mxcl/homebrew/pull/17052" title="Formula to install lldpd through Homebrew"><em>lldpd</em> formula</a> for <a href="http://mxcl.github.com/homebrew/" title="Homebrew: the missing package manager for OS X">Homebrew</a> which seems to be the most
popular package manager for <span class="caps">OS</span>&nbsp;X.</p>
<h2 id="upstart-and-systemd-support">Upstart and systemd support</h2>
<p>Many distributions offer <a href="http://upstart.ubuntu.com/" title="upstart, an event-based init daemon">upstart</a> and <a href="http://www.freedesktop.org/wiki/Software/systemd" title="systemd, system and service manager">systemd</a> as a
replacement or an alternative for the classic SysV init. Like most
daemons, <em>lldpd</em> detaches itself from the terminal and runs in the
background, by forking twice, once it is ready (for <em>lldpd</em>, this just
means we have set up the control socket). While both <em>upstart</em> and
<em>systemd</em> can accommodate daemons that behave like this, it is
recommended not to fork. How to advertise readiness in this&nbsp;case?</p>
<p>With <em>upstart</em>, <em>lldpd</em> will send itself the <code>SIGSTOP</code>
signal. <em>upstart</em> will detect this, resume <em>lldpd</em> with <code>SIGCONT</code> and
assume it is ready. The code to support <em>upstart</em> is therefore quite
simple. Instead of calling <code>daemon()</code>, do&nbsp;this:</p>
<div class="codehilite"><pre><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">upstartjob</span> <span class="o">=</span> <span class="n">getenv</span><span class="p">(</span><span class="s">&quot;UPSTART_JOB&quot;</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="p">(</span><span class="n">upstartjob</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">strcmp</span><span class="p">(</span><span class="n">upstartjob</span><span class="p">,</span> <span class="s">&quot;lldpd&quot;</span><span class="p">)))</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">log_debug</span><span class="p">(</span><span class="s">&quot;main&quot;</span><span class="p">,</span> <span class="s">&quot;running with upstart, don&#39;t fork but stop&quot;</span><span class="p">);</span>
<span class="n">raise</span><span class="p">(</span><span class="n"><span class="caps">SIGSTOP</span></span><span class="p">);</span>
</pre></div>


<p>The job configuration file looks like&nbsp;this:</p>
<div class="codehilite"><pre><span class="c"># lldpd - <span class="caps">LLDP</span> daemon</span>

<span class="n">description</span> <span class="s">&quot;<span class="caps">LLDP</span> daemon&quot;</span>

<span class="n">start</span> <span class="n">on</span> <span class="n">net</span><span class="o">-</span><span class="n">device</span><span class="o">-</span><span class="n">up</span> <span class="n"><span class="caps">IFACE</span></span><span class="p">=</span><span class="n">lo</span>
<span class="n">stop</span> <span class="n">on</span> <span class="n">runlevel</span> <span class="p">[</span>06<span class="p">]</span>

<span class="n">expect</span> <span class="n">stop</span>
<span class="n">respawn</span>

<span class="n">script</span>
  <span class="p">.</span> <span class="o">/</span><span class="n">etc</span><span class="o">/</span><span class="n">default</span><span class="o">/</span><span class="n">lldpd</span>
  <span class="nb">exec</span> <span class="n">lldpd</span> $<span class="n">DAEMON_ARGS</span>
<span class="k">end</span> <span class="n">script</span>
</pre></div>


<p><em>systemd</em> provides a socket to achieve the same goal. An application
is expected to write <code>READY=1</code> to the socket when it is ready. With
the provided library, this is just a matter of calling
<code>sd_notify(0, "READY=1")</code>. Since <code>sd_notify()</code> has fewer than 30 lines
of code, I have rewritten it to avoid an external dependency. The
appropriate unit file&nbsp;is:</p>
<div class="codehilite"><pre><span class="k">[Unit]</span>
<span class="na">Description</span><span class="o">=</span><span class="s"><span class="caps">LLDP</span> daemon</span>
<span class="na">Documentation</span><span class="o">=</span><span class="s">man:lldpd(8)</span>

<span class="k">[Service]</span>
<span class="na">Type</span><span class="o">=</span><span class="s">notify</span>
<span class="na">NotifyAccess</span><span class="o">=</span><span class="s">main</span>
<span class="na">EnvironmentFile</span><span class="o">=</span><span class="s">-/etc/default/lldpd</span>
<span class="na">ExecStart</span><span class="o">=</span><span class="s">/usr/sbin/lldpd $DAEMON_ARGS</span>
<span class="na">Restart</span><span class="o">=</span><span class="s">on-failure</span>

<span class="k">[Install]</span>
<span class="na">WantedBy</span><span class="o">=</span><span class="s">multi-user.target</span>
</pre></div>
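<p>A self-contained replacement for the readiness notification could look like the sketch below. It is not the code shipped with <em>lldpd</em>: the function name <code>notify_ready()</code> and its return values are hypothetical. It only relies on the documented protocol: <em>systemd</em> puts the path of a datagram Unix socket in the <code>NOTIFY_SOCKET</code> environment variable, and a leading <code>@</code> denotes a socket in the Linux abstract namespace.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Tell the service manager we are ready.
   Returns 1 if the message was sent, 0 if NOTIFY_SOCKET is not set
   (not running under systemd), -1 on error. */
int notify_ready(void)
{
    static const char msg[] = "READY=1";
    const char *path = getenv("NOTIFY_SOCKET");
    struct sockaddr_un sa;
    size_t len;
    int fd, rc = -1;

    if (path == NULL)
        return 0;
    len = strlen(path);
    /* Only absolute paths and abstract sockets ("@name") are valid. */
    if ((path[0] != '/' && path[0] != '@') || len < 2 ||
        len >= sizeof(sa.sun_path))
        return -1;

    memset(&sa, 0, sizeof(sa));
    sa.sun_family = AF_UNIX;
    memcpy(sa.sun_path, path, len);
    if (path[0] == '@')
        sa.sun_path[0] = '\0';      /* abstract namespace */

    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    if (sendto(fd, msg, sizeof(msg) - 1, 0, (struct sockaddr *)&sa,
            offsetof(struct sockaddr_un, sun_path) + len) ==
        (ssize_t)(sizeof(msg) - 1))
        rc = 1;
    close(fd);
    return rc;
}
```

<p>With <code>NotifyAccess=main</code>, only the main process of the service is allowed to send this message, so the call has to be made from the process started by <code>ExecStart</code>.</p>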


<h2 id="os-include-files"><span class="caps">OS</span> include files</h2>
<p>Linux-specific include files were a major pain in previous versions of
<em>lldpd</em>. The problems range from missing header files (like
<code>linux/if_bonding.h</code>) to the use of kernel-only types. Those headers
have a difficult history. They were first shipped with the C library
but were rarely synced and almost always outdated. They were then
extracted from a kernel release with almost no change and lagged behind
the kernel version used by the released distribution<sup id="fnref:sarge"><a href="#fn:sarge" rel="footnote">4</a></sup>.</p>
<p>Today, the problem is acknowledged and is being solved by both the
distributions which extract the headers from the packaged kernel and
by kernel developers with a
<a href="http://lwn.net/Articles/507794/" title="The UAPI header file split">separation of kernel-only headers from user-space <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> headers</a>. However,
we still need to handle&nbsp;legacy.</p>
<p>A good case is <code>linux/ethtool.h</code>:</p>
<ul>
<li>It can just be&nbsp;absent.</li>
<li>It can use <code>u8</code>, <code>u16</code> types which are kernel-only types. To work
   around this issue, <a href="https://github.com/vincentbernat/lldpd/blob/0.5.7/m4/ethtool.m4" title="Type munging for linux/ethtool.h">type munging</a> can be set&nbsp;up.</li>
<li>It can lack some definitions, like <code>SPEED_10000</code>. In this case, you
   either define the missing bits and find yourself with a long copy
   of the original header interleaved with <code>#ifdef</code> or conditionally
   use each symbol. The latter solution is a burden by itself but it
   also hinders some features that may be available in the
   running&nbsp;kernel.</li>
</ul>
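<p>The "define the missing bits" workaround mentioned in the last item typically looks like the sketch below. <code>HAVE_LINUX_ETHTOOL_H</code> is a hypothetical macro that a configure script would define when the header is usable, and <code>speed_label()</code> is an illustrative helper, not a function from <em>lldpd</em>.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* HAVE_LINUX_ETHTOOL_H is a hypothetical macro that a configure
   script would define when the header is present and usable. */
#ifdef HAVE_LINUX_ETHTOOL_H
#include <linux/ethtool.h>
#endif

/* Fallbacks for symbols that old or absent headers do not provide.
   The values match the kernel ones (speed in Mbit/s). */
#ifndef SPEED_1000
#define SPEED_1000 1000
#endif
#ifndef SPEED_10000
#define SPEED_10000 10000
#endif

/* Write a human-readable label for `speed` into `buf`.
   Returns 0 for a known speed, -1 otherwise. */
int speed_label(unsigned int speed, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    switch (speed) {
    case SPEED_1000:
        snprintf(buf, len, "1 Gbit/s");
        return 0;
    case SPEED_10000:
        snprintf(buf, len, "10 Gbit/s");
        return 0;
    default:
        buf[0] = '\0';
        return -1;
    }
}
```

<p>Each symbol needs its own guard, which is why this approach quickly turns into a partial copy of the original header.</p>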
<p>The easy solution to all this mess is to just include the appropriate
kernel headers into the source tree of the project. Thanks to Google
ripping them for its Bionic C library, we know that
<a href="http://lwn.net/Articles/434318/" title="Has Bionic stepped over the GPL line?">copying kernel headers into a program does not create a derivative work</a>.</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:uint16t">
<p>Therefore, the use of <code>u_int16_t</code> and <code>u_int8_t</code> types is
        a left-over of the previous serializer where the size of
        all members was important.&#160;<a href="#fnref:uint16t" rev="footnote" title="Jump back to footnote 1 in the text">&#8617;</a></p>
</li>
<li id="fn:library">
<p>For more comprehensive guidelines, be sure to check
        <a href="http://davidz25.blogspot.fr/2011/07/writing-c-library-intro-conclusion-and.html" title="Writing a C library">Writing a C library</a>.&#160;<a href="#fnref:library" rev="footnote" title="Jump back to footnote 2 in the text">&#8617;</a></p>
</li>
<li id="fn:libeditapi">
<p>Tokenization is not the only advantage of <code>libedit</code>
           native <abbr title="Application Programming Interface"><span class="caps">API</span></abbr>. The <abbr title="Application Programming Interface"><span class="caps">API</span></abbr> is also cleaner, does not have a
           global state and has better documentation. All the
           implementations are also <abbr title="Berkeley Software Distribution"><span class="caps">BSD</span></abbr>-licensed.&#160;<a href="#fnref:libeditapi" rev="footnote" title="Jump back to footnote 3 in the text">&#8617;</a></p>
</li>
<li id="fn:sarge">
<p>For example, in Debian Sarge, the Linux kernel was a 2.6.8
      (2004) while the kernel headers were extracted from some
      pre-2.6 kernel.&#160;<a href="#fnref:sarge" rev="footnote" title="Jump back to footnote 4 in the text">&#8617;</a></p>
</li>
</ol>
</div>
]]>
            </content>
        </entry>
                <entry>
            <title type="html">Network virtualization with VXLAN</title>
            <author><name>Vincent Bernat</name></author>
            <link href="http://vincent.bernat.im/en/blog/2012-multicast-vxlan.html"/>
            <updated>2012-11-02T18:07:58+01:00</updated>
            <id>http://www.luffy.cx/en/blog/2012-multicast-vxlan.html</id>

            <content type="html">
<![CDATA[
<p><em>Virtual eXtensible Local Area Network</em> (<abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr>) is a protocol to
overlay a virtualized L2 network over an existing <span class="caps">IP</span> network with
little setup. It is currently described in an
<a href="http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02" title="VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over">Internet-Draft</a>. It adds the following perks to VLANs
while still providing&nbsp;isolation:</p>
<ol>
<li>It uses a 24-bit <em><abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr> Network Identifier</em> (<abbr title="VXLAN Network Identifier"><span class="caps">VNI</span></abbr>) which should be
    enough to address any scale-based concerns of&nbsp;multitenancy.</li>
<li>It wraps L2 frames into <span class="caps">UDP</span> datagrams. This allows one to rely on
    some interesting properties of <span class="caps">IP</span> networks like availability and
    scalability. A <abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr> segment can be extended far beyond the
    typical reach of today&#8217;s&nbsp;VLANs.</li>
</ol>
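<p>A 24-bit identifier leaves room for 2<sup>24</sup> (about 16 million) segments, where a 12-bit <span class="caps">VLAN</span> ID allows 4096. The sketch below shows how the <span class="caps">VNI</span> sits in the 8-byte <span class="caps">VXLAN</span> header that precedes the encapsulated L2 frame inside the <span class="caps">UDP</span> payload: one byte of flags, three reserved bytes, the 24-bit <span class="caps">VNI</span> and a final reserved byte. The function and macro names are hypothetical; this is an illustration, not code from the Linux implementation.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VXLAN_HDR_LEN  8
#define VXLAN_FLAG_VNI 0x08   /* "I" flag: the VNI field is valid */

/* Fill `hdr` with a VXLAN header carrying `vni` (24 bits at most). */
void vxlan_encode(uint8_t hdr[VXLAN_HDR_LEN], uint32_t vni)
{
    assert(vni <= 0xFFFFFF);
    memset(hdr, 0, VXLAN_HDR_LEN);   /* reserved fields are zero */
    hdr[0] = VXLAN_FLAG_VNI;
    hdr[4] = (uint8_t)(vni >> 16);
    hdr[5] = (uint8_t)(vni >> 8);
    hdr[6] = (uint8_t)vni;
}

/* Extract the VNI from `hdr`. Returns 0 on success, -1 if the
   header does not carry a valid VNI. */
int vxlan_decode(const uint8_t hdr[VXLAN_HDR_LEN], uint32_t *vni)
{
    if (!(hdr[0] & VXLAN_FLAG_VNI))
        return -1;
    *vni = (uint32_t)hdr[4] << 16 | (uint32_t)hdr[5] << 8 | hdr[6];
    return 0;
}
```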
<p>The <em><abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr> Tunnel End Point</em> (<abbr title="VXLAN Tunnel End Point"><span class="caps">VTEP</span></abbr>) originates and terminates <abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr>
tunnels. Thanks to a series of patches from Stephen Hemminger,
<a href="http://lwn.net/Articles/518292/" title="vxlan: virtual extensible lan">Linux can now act as a <abbr title="VXLAN Tunnel End Point"><span class="caps">VTEP</span></abbr></a>. Let&#8217;s see how this&nbsp;works.</p>
<div class="toc">
<ul>
<li><a href="#about-ipv6">About IPv6</a></li>
<li><a href="#lab">Lab</a><ul>
<li><a href="#unicast-routing">Unicast routing</a></li>
<li><a href="#multicast-routing">Multicast routing</a><ul>
<li><a href="#igmp"><span class="caps">IGMP</span></a></li>
<li><a href="#pim-sm"><span class="caps">PIM</span>-<span class="caps">SM</span></a></li>
</ul>
</li>
<li><a href="#setting-up-vxlan">Setting up <span class="caps">VXLAN</span></a></li>
</ul>
</li>
<li><a href="#demo">Demo</a></li>
</ul>
</div>
<h1 id="about-ipv6">About IPv6</h1>
<p>When possible, I try to use IPv6 for my labs. This is not the case
here for several&nbsp;reasons:</p>
<ol>
<li><span class="caps">IP</span> multicast is required and <abbr title="Protocol Independent Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr> implementations for IPv6 are
    not widespread yet. However, they exist. This explains why I use
    <a href="http://www.xorp.org/" title="XORP: Extensible open source routing platform"><span class="caps">XORP</span></a> for this lab: it supports <abbr title="Protocol Independent Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr> for both IPv4 and&nbsp;IPv6.</li>
<li>The <a href="http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-02" title="VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over"><abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr> Internet-Draft</a> specifically addresses only
    IPv4. This seems a bit odd for a protocol running on top of <span class="caps">UDP</span>
    and I hope this will be fixed soon. This is not a major
    blocker since
    <a href="https://github.com/upa/vxlan/" title="VXLAN implementation using Linux tap interfaces">some <abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr> implementations support IPv6</a>.</li>
<li>However, the current implementation for Linux does not support
    IPv6. IPv6 support will be <a href="http://www.spinics.net/lists/netdev/msg214956.html" title="Plan for adding IPv6 support for VXLAN in Linux">added later</a>.</li>
</ol>
<p>Once IPv6 support is available, the lab should be easy to&nbsp;adapt.</p>
<p><strong><span class="caps">UPDATED</span>:</strong> The <a href="http://tools.ietf.org/html/draft-mahalingam-dutt-dcops-vxlan-03" title="VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over (Draft 03)">latest draft</a> addresses IPv6 support. It is
currently being added to the Linux&nbsp;implementation.</p>
<h1 id="lab">Lab</h1>
<p>So, here is the lab used. <code>R1</code>, <code>R2</code> and <code>R3</code> will act as VTEPs. They
do not make use of <abbr title="Protocol Independant Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr>. Instead, they have a generic multicast route
on <code>eth0</code>. <code>E1</code>, <code>E2</code> and <code>E3</code> are edge routers while <code>C1</code>, <code>C2</code> and
<code>C3</code> are core routers. The proposed lab is not resilient but
convenient to explain how things work. It is built on top of <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr>
hosts. Have a look at my <a href="/en/blog/2012-network-lab-kvm.html" title="Network lab with KVM">previous article</a> for more details on&nbsp;this.</p>
<p><img alt="VXLAN lab" src="//d1g3mdmxf8zbo9.cloudfront.net/images/vxlan/lab.png" title="Topology of VXLAN lab"></p>
<p>The lab is hosted on <a href="https://github.com/vincentbernat/network-lab/tree/master/lab-vxlan" title="VXLAN lab">GitHub</a>. I have made the lab easier to try by
including the kernel I have used for my tests. <span class="caps">XORP</span> comes
preconfigured; you just have to configure the <abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr> part. For this,
you need a recent version of <code>ip</code>.</p>
<div class="codehilite"><pre><span class="gp">$</span> sudo apt-get install screen vde2 kvm iproute xorp git
<span class="gp">$</span> git clone git://git.kernel.org/pub/scm/linux/kernel/git/shemminger/iproute2.git
<span class="gp">$</span> <span class="nb">cd </span>iproute2
<span class="gp">$</span> ./configure <span class="o">&amp;&amp;</span> make
<span class="go">You get `ip&#39; as `ip/ip&#39; and `bridge&#39; as `bridge/bridge&#39;.</span>
<span class="gp">$</span> <span class="nb">cd</span> ..
<span class="gp">$</span> git clone git://github.com/vincentbernat/network-lab.git
<span class="gp">$</span> <span class="nb">cd </span>network-lab/lab-vxlan
<span class="gp">$</span> ./setup
</pre></div>


<h2 id="unicast-routing">Unicast routing</h2>
<p>The first step is to set up unicast routing. <abbr title="Open Shortest Path First"><span class="caps">OSPF</span></abbr> is used for this
purpose. The chosen routing daemon is <a href="http://www.xorp.org/" title="XORP: Extensible open source routing platform"><span class="caps">XORP</span></a>. With <code>xorpsh</code>, we can
check if <abbr title="Open Shortest Path First"><span class="caps">OSPF</span></abbr> is working as&nbsp;expected:</p>
<div class="codehilite"><pre><span class="gp">root@c1#</span> xorpsh
<span class="gp">root@c1$</span> show ospf4 neighbor   
<span class="go">  Address         Interface             State      <span class="caps">ID</span>              Pri  Dead</span>
<span class="go">192.168.11.11    eth0/eth0              Full      3.0.0.1          128    36</span>
<span class="go">192.168.12.22    eth1/eth1              Full      3.0.0.2          128    33</span>
<span class="go">192.168.101.133  eth2/eth2              Full      2.0.0.3          128    36</span>
<span class="go">192.168.102.122  eth3/eth3              Full      2.0.0.2          128    38</span>
<span class="gp">root@c1$</span> show route table ipv4 unicast ospf   
<span class="go">192.168.1.0/24  [ospf(110)/2]</span>
<span class="go">                &gt; to 192.168.11.11 via eth0/eth0</span>
<span class="go">192.168.2.0/24  [ospf(110)/2]</span>
<span class="go">                &gt; to 192.168.12.22 via eth1/eth1</span>
<span class="go">192.168.3.0/24  [ospf(110)/3]</span>
<span class="go">                &gt; to 192.168.102.122 via eth3/eth3</span>
<span class="go">192.168.13.0/24 [ospf(110)/2]</span>
<span class="go">                &gt; to 192.168.102.122 via eth3/eth3</span>
<span class="go">192.168.21.0/24 [ospf(110)/2]</span>
<span class="go">                &gt; to 192.168.101.133 via eth2/eth2</span>
<span class="go">192.168.22.0/24 [ospf(110)/2]</span>
<span class="go">                &gt; to 192.168.12.22 via eth1/eth1</span>
<span class="go">192.168.23.0/24 [ospf(110)/2]</span>
<span class="go">                &gt; to 192.168.101.133 via eth2/eth2</span>
<span class="go">192.168.103.0/24        [ospf(110)/2]</span>
<span class="go">                &gt; to 192.168.102.122 via eth3/eth3</span>
</pre></div>


<h2 id="multicast-routing">Multicast routing</h2>
<p>Once unicast routing is up and running, we need to set up multicast
routing. There are two protocols for this: <em><abbr title="Internet Group Management Protocol"><abbr title="Internet Group Management Protocol"><span class="caps">IGMP</span></abbr></abbr></em> and <em><abbr title="Protocol Independant Multicast - Sparse Mode"><abbr title="Protocol Independant Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr></abbr></em>. The
latter allows routers to forward multicast datagrams while the
former allows hosts to subscribe to a multicast&nbsp;group.</p>
<h3 id="igmp"><abbr title="Internet Group Management Protocol"><span class="caps">IGMP</span></abbr></h3>
<p><abbr title="Internet Group Management Protocol"><span class="caps">IGMP</span></abbr> is used by hosts and adjacent routers to establish multicast
group membership. In our case, it will be used by <code>R2</code> to let <code>E2</code>
know it subscribed to <code>239.0.0.11</code> (a multicast&nbsp;group).</p>
<p>Configuring <span class="caps">XORP</span> to support <abbr title="Internet Group Management Protocol"><span class="caps">IGMP</span></abbr> is simple. Let&#8217;s test with <code>iperf</code> to
have a multicast listener on <code>R2</code>:</p>
<div class="codehilite"><pre><span class="gp">root@r2#</span> iperf -u -s -l 1000 -i 1 -B 239.0.0.11
<span class="go">------------------------------------------------------------</span>
<span class="go">Server listening on <span class="caps">UDP</span> port 5001</span>
<span class="go">Binding to local address 239.0.0.11</span>
<span class="go">Joining multicast group  239.0.0.11</span>
<span class="go">Receiving 1000 byte datagrams</span>
<span class="go"><span class="caps">UDP</span> buffer size:  208 KByte (default)</span>
<span class="go">------------------------------------------------------------</span>
</pre></div>


<p>On <code>E2</code>, we can now check that <code>R2</code> is properly registered for
<code>239.0.0.11</code>:</p>
<div class="codehilite"><pre><span class="gp">root@e2$</span> show igmp group
<span class="go">Interface    Group           Source          LastReported Timeout V State</span>
<span class="go">eth0         239.0.0.11      0.0.0.0         192.168.2.2      248 2     E</span>
</pre></div>


<p><span class="caps">XORP</span> documentation contains a <a href="http://xorp.run.montefiore.ulg.ac.be/latex2wiki/user_manual/igmp_and_mld" title="XORP documentation on IGMP and MLD">good overview of <abbr title="Internet Group Management Protocol"><abbr title="Internet Group Management Protocol"><span class="caps">IGMP</span></abbr></abbr></a>.</p>
<h3 id="pim-sm"><abbr title="Protocol Independant Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr></h3>
<p><abbr title="Protocol Independant Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr> is far more complex. It does not have its own topology
discovery protocol and relies on routing information from other
protocols, <abbr title="Open Shortest Path First"><span class="caps">OSPF</span></abbr> in our&nbsp;case.</p>
<p>I will describe here a simplified view of how <abbr title="Protocol Independant Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr> works. <span class="caps">XORP</span>
documentation contains <a href="http://xorp.run.montefiore.ulg.ac.be/latex2wiki/user_manual/pim_sparse_mode" title="XORP documentation on PIM-SM">more details about <abbr title="Protocol Independant Multicast - Sparse Mode"><abbr title="Protocol Independant Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr></abbr></a>.</p>
<p>The first step for all <abbr title="Protocol Independant Multicast - Sparse Mode"><span class="caps">PIM</span>-<span class="caps">SM</span></abbr> routers is to <strong>elect a rendez-vous
point</strong> (<abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>). In our lab, only <code>C1</code>, <code>C2</code> and <code>C3</code> have been
configured to be elected as an <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>. Moreover, we give a better priority to
<code>C3</code> to ensure it&nbsp;wins.</p>
<p><img alt="RP election" src="//d1g3mdmxf8zbo9.cloudfront.net/images/vxlan/multicast-rp.png" title="C3 has been elected as RP"></p>
<div class="codehilite"><pre><span class="gp">root@e1$</span> show pim rps   
<span class="go"><span class="caps">RP</span>              Type      Pri Holdtime Timeout ActiveGroups GroupPrefix       </span>
<span class="go">192.168.101.133 bootstrap 100      150     135            0 239.0.0.0/8</span>
</pre></div>


<p>Let&#8217;s suppose we start <code>iperf</code> on both <code>R2</code> and <code>R3</code>. Using <abbr title="Internet Group Management Protocol"><span class="caps">IGMP</span></abbr>, they
subscribe to multicast group <code>239.0.0.11</code> with <code>E2</code> and <code>E3</code>
respectively. Then, <code>E2</code> and <code>E3</code> send a join message (also known as a
<em>(*,G) join</em>) to the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr> (<code>C3</code>) for that multicast group. Using the
unicast path from <code>E2</code> and <code>E3</code> to the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>, the <strong>routers along the
paths build the <abbr title="Rendez-vous Point"><abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr></abbr> tree (<abbr title="Rendez-vous Point Tree"><abbr title="Rendez-vous Point Tree"><span class="caps">RPT</span></abbr></abbr>)</strong>, rooted at <code>C3</code>.  Each router in the
tree knows how to send multicast packets to <code>239.0.0.11</code>: it will send
them to the&nbsp;leaves.</p>
<p><img alt="RP tree" src="//d1g3mdmxf8zbo9.cloudfront.net/images/vxlan/multicast-rptree.png" title="RP tree for 239.0.0.11 has been built"></p>
<div class="codehilite"><pre><span class="gp">root@e3$</span> show pim join   
<span class="go">Group           Source          <span class="caps">RP</span>              Flags</span>
<span class="go">239.0.0.11      0.0.0.0         192.168.101.133 <span class="caps">WC</span>   </span>
<span class="go">    Upstream interface (<span class="caps">RP</span>):   eth2</span>
<span class="go">    Upstream <span class="caps">MRIB</span> next hop (<span class="caps">RP</span>): 192.168.23.133</span>
<span class="go">    Upstream <span class="caps">RPF</span>&#39;(*,G):        192.168.23.133</span>
<span class="go">    Upstream state:            Joined </span>
<span class="go">    Join timer:                5</span>
<span class="go">    Local receiver include <span class="caps">WC</span>: O...</span>
<span class="go">    Joins <span class="caps">RP</span>:                  ....</span>
<span class="go">    Joins <span class="caps">WC</span>:                  ....</span>
<span class="go">    Join state:                ....</span>
<span class="go">    Prune state:               ....</span>
<span class="go">    Prune pending state:       ....</span>
<span class="go">    I am assert winner state:  ....</span>
<span class="go">    I am assert loser state:   ....</span>
<span class="go">    Assert winner <span class="caps">WC</span>:          ....</span>
<span class="go">    Assert lost <span class="caps">WC</span>:            ....</span>
<span class="go">    Assert tracking <span class="caps">WC</span>:        <span class="caps">O.O.</span></span>
<span class="go">    Could assert <span class="caps">WC</span>:           O...</span>
<span class="go">    I am <span class="caps">DR</span>:                   O..O</span>
<span class="go">    Immediate olist <span class="caps">RP</span>:        ....</span>
<span class="go">    Immediate olist <span class="caps">WC</span>:        O...</span>
<span class="go">    Inherited olist <span class="caps">SG</span>:        O...</span>
<span class="go">    Inherited olist SG_RPT:    O...</span>
<span class="go">    <span class="caps">PIM</span> include <span class="caps">WC</span>:            O...</span>
</pre></div>


<p>Let&#8217;s suppose that <code>R1</code> wants to send multicast packets to
<code>239.0.0.11</code>. It sends them to <code>E1</code> which does not have any
information on how to contact all the members of the multicast group
because it is not the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>. Therefore, it <strong>encapsulates the multicast
packets into <span class="caps">PIM</span> Register packets</strong> and sends them to the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>. The <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>
decapsulates them and sends them natively. The multicast packets are
routed from the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr> to <code>R2</code> and <code>R3</code> using the reverse path formed by
the join&nbsp;messages.</p>
<p><img alt="Multicast routing via the RP" src="//d1g3mdmxf8zbo9.cloudfront.net/images/vxlan/multicast-register.png" title="R1 sends multicast packets to 239.0.0.11 via the RP"></p>
<div class="codehilite"><pre><span class="gp">root@r1#</span> iperf -c 239.0.0.11 -u -b 10k -t 30 -T 10
<span class="go">------------------------------------------------------------</span>
<span class="go">Client connecting to 239.0.0.11, <span class="caps">UDP</span> port 5001</span>
<span class="go">Sending 1470 byte datagrams</span>
<span class="go">Setting multicast <span class="caps">TTL</span> to 10</span>
<span class="go"><span class="caps">UDP</span> buffer size:  208 KByte (default)</span>
<span class="go">------------------------------------------------------------</span>
<span class="gp">root@e1#</span> tcpdump -pni eth0
<span class="go">10:58:23.424860 <span class="caps">IP</span> 192.168.1.1.35277 &gt; 239.0.0.11.5001: <span class="caps">UDP</span>, length 1470</span>
<span class="gp">root@c3#</span> tcpdump -pni eth0
<span class="go">10:58:23.552903 <span class="caps">IP</span> 192.168.11.11 &gt; 192.168.101.133: PIMv2, Register, length 1480</span>
<span class="gp">root@e2#</span> tcpdump -pni eth0
<span class="go">10:58:23.896171 <span class="caps">IP</span> 192.168.1.1.35277 &gt; 239.0.0.11.5001: <span class="caps">UDP</span>, length 1470</span>
<span class="gp">root@e3#</span> tcpdump -pni eth0
<span class="go">10:58:23.824647 <span class="caps">IP</span> 192.168.1.1.35277 &gt; 239.0.0.11.5001: <span class="caps">UDP</span>, length 1470</span>
</pre></div>


<p>As presented here, the routing is not optimal: packets from <code>R1</code> to
<code>R2</code> could avoid the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>. Moreover, encapsulating multicast packets
into unicast packets is not efficient either. At some point, the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>
will decide to switch to native multicast<sup id="fnref:native"><a href="#fn:native" rel="footnote">1</a></sup>. Rooted at <code>R1</code>,
<strong>the shortest-path tree (<abbr title="Shortest-path Tree"><abbr title="Shortest-path Tree"><span class="caps">SPT</span></abbr></abbr>) for the multicast group will be built
using source-specific join messages</strong> (also known as a <em>(S,G) join</em>).</p>
<p><img alt="Multicast routing without RP" src="//d1g3mdmxf8zbo9.cloudfront.net/images/vxlan/multicast-native.png" title="R1 sends multicast packets to 239.0.0.11 using native multicast following the shortest-path tree"></p>
<p>From here, each router in the tree knows how to handle multicast
packets from <code>R1</code> to the group without involving the <abbr title="Rendez-vous Point"><span class="caps">RP</span></abbr>. For example,
<code>E1</code> knows it must duplicate the packet and send one through the
interface to <code>C3</code> and the other one through the interface to <code>C1</code>:</p>
<div class="codehilite"><pre><span class="gp">root@e1$</span> show pim join   
<span class="go">Group           Source          <span class="caps">RP</span>              Flags</span>
<span class="go">239.0.0.11      192.168.1.1     192.168.101.133 <span class="caps">SG</span> <span class="caps">SPT</span> DirectlyConnectedS </span>
<span class="go">    Upstream interface (S):    eth0</span>
<span class="go">    Upstream interface (<span class="caps">RP</span>):   eth1</span>
<span class="go">    Upstream <span class="caps">MRIB</span> next hop (<span class="caps">RP</span>): 192.168.11.111</span>
<span class="go">    Upstream <span class="caps">MRIB</span> next hop (S):  <span class="caps">UNKNOWN</span></span>
<span class="go">    Upstream <span class="caps">RPF</span>&#39;(S,G):        <span class="caps">UNKNOWN</span></span>
<span class="go">    Upstream state:            Joined </span>
<span class="go">    Register state:            RegisterPrune RegisterCouldRegister </span>
<span class="go">    Join timer:                7</span>
<span class="go">    <span class="caps">KAT</span>(S,G) running:          true</span>
<span class="go">    Local receiver include <span class="caps">WC</span>: ....</span>
<span class="go">    Local receiver include <span class="caps">SG</span>: ....</span>
<span class="go">    Local receiver exclude <span class="caps">SG</span>: ....</span>
<span class="go">    Joins <span class="caps">RP</span>:                  ....</span>
<span class="go">    Joins <span class="caps">WC</span>:                  ....</span>
<span class="go">    Joins <span class="caps">SG</span>:                  .<span class="caps">OO</span>.</span>
<span class="go">    Join state:                .<span class="caps">OO</span>.</span>
<span class="go">    Prune state:               ....</span>
<span class="go">    Prune pending state:       ....</span>
<span class="go">    I am assert winner state:  ....</span>
<span class="go">    I am assert loser state:   ....</span>
<span class="go">    Assert winner <span class="caps">WC</span>:          ....</span>
<span class="go">    Assert winner <span class="caps">SG</span>:          ....</span>
<span class="go">    Assert lost <span class="caps">WC</span>:            ....</span>
<span class="go">    Assert lost <span class="caps">SG</span>:            ....</span>
<span class="go">    Assert lost SG_RPT:        ....</span>
<span class="go">    Assert tracking <span class="caps">SG</span>:        <span class="caps">OOO</span>.</span>
<span class="go">    Could assert <span class="caps">WC</span>:           ....</span>
<span class="go">    Could assert <span class="caps">SG</span>:           .<span class="caps">OO</span>.</span>
<span class="go">    I am <span class="caps">DR</span>:                   O..O</span>
<span class="go">    Immediate olist <span class="caps">RP</span>:        ....</span>
<span class="go">    Immediate olist <span class="caps">WC</span>:        ....</span>
<span class="go">    Immediate olist <span class="caps">SG</span>:        .<span class="caps">OO</span>.</span>
<span class="go">    Inherited olist <span class="caps">SG</span>:        .<span class="caps">OO</span>.</span>
<span class="go">    Inherited olist SG_RPT:    ....</span>
<span class="go">    <span class="caps">PIM</span> include <span class="caps">WC</span>:            ....</span>
<span class="go">    <span class="caps">PIM</span> include <span class="caps">SG</span>:            ....</span>
<span class="go">    <span class="caps">PIM</span> exclude <span class="caps">SG</span>:            ....</span>
<span class="gp">root@e1$</span> show pim mfc  
<span class="go">Group           Source          <span class="caps">RP</span>             </span>
<span class="go">239.0.0.11      192.168.1.1     192.168.101.133</span>
<span class="go">    Incoming interface :      eth0</span>
<span class="go">    Outgoing interfaces:      .<span class="caps">OO</span>.</span>
<span class="gp">root@e1$</span> <span class="nb">exit</span>
<span class="go">[Connection to <span class="caps">XORP</span> closed]</span>
<span class="gp">root@e1#</span> ip mroute
<span class="go">(192.168.1.1, 239.0.0.11)        Iif: eth0       Oifs: eth1 eth2</span>
</pre></div>
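<p>The dotted strings in the <span class="caps">XORP</span> output, such as <code>.OO.</code>, appear to be per-interface flags: each position stands for one interface and <code>O</code> marks the ones that are set. Comparing <code>show pim mfc</code> with <code>ip mroute</code> above suggests the positions follow the order <code>eth0</code>, <code>eth1</code>, <code>eth2</code>. Here is a minimal sketch to decode such a string. <code>decode_vifs</code> is a hypothetical helper, and naming the fourth position <code>register_vif</code> (the <span class="caps">PIM</span> register pseudo-interface) is an assumption:</p>

```shell
#!/bin/sh
# decode_vifs is a hypothetical helper: given a XORP interface bitmap such as
# ".OO." and the interface names in the same order, it prints the names whose
# position is marked with "O".
decode_vifs() {
    bitmap=$1
    shift
    result=""
    for vif in "$@"; do
        case $bitmap in
            O*) result="$result $vif" ;;
        esac
        bitmap=${bitmap#?}    # drop the first character
    done
    # Unquoted on purpose: word splitting trims the leading space.
    echo $result
}

# Assumption: the fourth position is the PIM register pseudo-interface.
decode_vifs .OO. eth0 eth1 eth2 register_vif
```

<p>With the bitmap from <code>show pim mfc</code>, this prints <code>eth1 eth2</code>, which matches the <code>Oifs</code> reported by <code>ip mroute</code>.</p>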


<h2 id="setting-up-vxlan">Setting up <abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr></h2>
<p>Once <span class="caps">IP</span> multicast is running, setting up <abbr title="Virtual eXtensible Local Area Network"><span class="caps">VXLAN</span></abbr> is quite easy. Here are the software&nbsp;requirements:</p>
<ul>
<li>A recent kernel. Pick at least 3.7-rc3. You need to enable the
   <code>CONFIG_VXLAN</code> option. You also currently need a patch on top of it
   to be able to <a href="http://patchwork.ozlabs.org/patch/195622/" title="vxlan: allow a user to set TTL value">specify a <span class="caps">TTL</span> greater than 1</a> for
   multicast&nbsp;packets.</li>
<li>A recent version of <code>ip</code>. Currently, you need the version from <em>git</em>.</li>
</ul>
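<p>To check the first requirement on a kernel you did not build yourself, you can look for <code>CONFIG_VXLAN</code> in its configuration file. This is only a sketch: <code>has_vxlan</code> is a hypothetical helper, and the location <code>/boot/config-$(uname -r)</code> is an assumption that holds on Debian-like systems but not everywhere:</p>

```shell
#!/bin/sh
# has_vxlan is a hypothetical helper: it succeeds when the given kernel
# configuration file enables VXLAN, either built-in (=y) or as a module (=m).
has_vxlan() {
    grep -Eq '^CONFIG_VXLAN=(y|m)$' "$1"
}

# Assumption: the distribution installs the configuration of the running
# kernel as /boot/config-$(uname -r). Adjust the path if it does not.
config="/boot/config-$(uname -r)"
if [ -r "$config" ]; then
    if has_vxlan "$config"; then
        echo "VXLAN support enabled"
    else
        echo "VXLAN support missing"
    fi
fi
```

<p>If the file is not readable, the script stays silent. In that case, point <code>has_vxlan</code> at the <code>.config</code> used to build your kernel.</p>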
<p>On <code>R1</code>, <code>R2</code> and <code>R3</code>, we create a <code>vxlan42</code> interface with the following&nbsp;commands:</p>
<div class="codehilite"><pre><span class="gp">root@rX#</span> ./ip link add vxlan42 <span class="nb">type </span>vxlan id 42 <span class="se">\</span>
<span class="gp">&gt;</span>                               group 239.0.0.42 <span class="se">\</span>
<span class="gp">&gt;</span>                               ttl 10 dev eth0
<span class="gp">root@rX#</span> ip link <span class="nb">set </span>up dev vxlan42
<span class="gp">root@rX#</span> ./ip -d link show vxlan42
<span class="go">10: vxlan42: &lt;<span class="caps">BROADCAST</span>,<span class="caps">MULTICAST</span>,<span class="caps">UP</span>,LOWER_UP&gt; mtu 1460 qdisc noqueue state <span class="caps">UNKNOWN</span> mode <span class="caps">DEFAULT</span> </span>
<span class="go">link/ether 3e:09:1c:e1:09:2e brd ff:ff:ff:ff:ff:ff</span>
<span class="go">vxlan id 42 group 239.0.0.42 dev eth0 port 32768 61000 ttl 10 ageing 300</span>
</pre></div>


<p>Let&#8217;s assign an <span class="caps">IP</span> in <code>192.168.99.0/24</code> for each router and check they can ping each&nbsp;other:</p>
<div class="codehilite"><pre><span class="gp">root@r1#</span> ip addr add 192.168.99.1/24 dev vxlan42
<span class="gp">root@r2#</span> ip addr add 192.168.99.2/24 dev vxlan42
<span class="gp">root@r3#</span> ip addr add 192.168.99.3/24 dev vxlan42
<span class="gp">root@r1#</span> ping 192.168.99.2                    
<span class="go"><span class="caps">PING</span> 192.168.99.2 (192.168.99.2) 56(84) bytes of data.</span>
<span class="go">64 bytes from 192.168.99.2: icmp_req=1 ttl=64 time=3.90 ms</span>
<span class="go">64 bytes from 192.168.99.2: icmp_req=2 ttl=64 time=1.38 ms</span>
<span class="go">64 bytes from 192.168.99.2: icmp_req=3 ttl=64 time=1.82 ms</span>

<span class="go">--- 192.168.99.2 ping statistics ---</span>
<span class="go">3 packets transmitted, 3 received, 0% packet loss, time 2003ms</span>
<span class="go">rtt min/avg/max/mdev = 1.389/2.375/3.907/1.098 ms</span>
</pre></div>
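<p>Since the three routers only differ by the last byte of their overlay address, the steps above can be folded into one helper. This is a sketch rather than a part of the lab: <code>vxlan_commands</code> is a hypothetical function which prints the commands instead of running them, so that they can be reviewed before being fed to a root shell:</p>

```shell
#!/bin/sh
# vxlan_commands is a hypothetical helper: it prints, without running them,
# the commands used in this lab to bring up vxlan42 on router number $1.
vxlan_commands() {
    n=$1
    # "./ip" is the freshly built iproute2 binary, as in the session above.
    echo "./ip link add vxlan42 type vxlan id 42 group 239.0.0.42 ttl 10 dev eth0"
    echo "ip link set up dev vxlan42"
    echo "ip addr add 192.168.99.$n/24 dev vxlan42"
}

vxlan_commands 2
```

<p>On <code>R2</code>, the output could then be piped to <code>sh</code> to apply the configuration.</p>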


<p>We can check the packets are&nbsp;encapsulated:</p>
<div class="codehilite"><pre><span class="gp">root@r1#</span> tcpdump -pni eth0
<span class="go">tcpdump: verbose output suppressed, use -v or -vv for full protocol decode</span>
<span class="go">listening on eth0, link-type <span class="caps">EN10MB</span> (Ethernet), capture size 65535 bytes</span>
<span class="go">11:30:36.561185 <span class="caps">IP</span> 192.168.1.1.43349 &gt; 192.168.2.2.8472: <span class="caps">UDP</span>, length 106</span>
<span class="go">11:30:36.563179 <span class="caps">IP</span> 192.168.2.2.33894 &gt; 192.168.1.1.8472: <span class="caps">UDP</span>, length 106</span>
<span class="go">11:30:37.562677 <span class="caps">IP</span> 192.168.1.1.43349 &gt; 192.168.2.2.8472: <span class="caps">UDP</span>, length 106</span>
<span class="go">11:30:37.564316 <span class="caps">IP</span> 192.168.2.2.33894 &gt; 192.168.1.1.8472: <span class="caps">UDP</span>, length 106</span>
</pre></div>
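<p>The length of 106 bytes reported by <code>tcpdump</code> can be cross-checked with a bit of arithmetic. This is a minimal sketch, assuming the default 56-byte <code>ping</code> payload, an inner IPv4 header without options and an untagged inner Ethernet frame. The function name <code>vxlan_udp_len</code> is made up for the illustration:</p>

```shell
#!/bin/sh
# Size of the UDP payload carrying a VXLAN-encapsulated ICMP echo.
# vxlan_udp_len is a hypothetical helper, not part of iproute2.
vxlan_udp_len() {
    icmp_data=$1     # ICMP payload (56 bytes by default with ping)
    icmp_hdr=8       # ICMP header
    ip_hdr=20        # inner IPv4 header, no options
    eth_hdr=14       # inner Ethernet header, no VLAN tag
    vxlan_hdr=8      # VXLAN header
    echo $((icmp_data + icmp_hdr + ip_hdr + eth_hdr + vxlan_hdr))
}

vxlan_udp_len 56
```

<p>This prints <code>106</code>: the 84 bytes announced by <code>ping</code>, plus 14 bytes for the inner Ethernet header and 8 bytes for the <span class="caps">VXLAN</span> header. The outer <span class="caps">UDP</span> and <span class="caps">IP</span> headers are not part of the length displayed by <code>tcpdump</code>.</p>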


<p>Moreover, if we send broadcast packets (with <code>ping -b</code> or <span class="caps">ARP</span>
requests), they are encapsulated into multicast&nbsp;packets:</p>
<div class="codehilite"><pre><span class="gp">root@r1#</span> tcpdump -pni eth0
<span class="go">11:31:27.464198 <span class="caps">IP</span> 192.168.1.1.41958 &gt; 239.0.0.42.8472: <span class="caps">UDP</span>, length 106</span>
<span class="go">11:31:28.463584 <span class="caps">IP</span> 192.168.1.1.41958 &gt; 239.0.0.42.8472: <span class="caps">UDP</span>, length 106</span>
</pre></div>


<p>Recent versions of <code>iproute</code> also come with <code>bridge</code>, a utility
allowing one to inspect the <abbr title="Forwarding Database"><span class="caps">FDB</span></abbr> of bridge-like&nbsp;interfaces:</p>
<div class="codehilite"><pre><span class="gp">root@r1#</span> ../bridge/bridge fdb show vxlan42
<span class="go">3e:09:1c:e1:09:2e dev vxlan42 dst 192.168.2.2 self </span>
<span class="go">0e:98:40:c6:58:10 dev vxlan42 dst 192.168.3.3 self</span>
</pre></div>
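<p>Each <abbr title="Forwarding Database"><span class="caps">FDB</span></abbr> entry associates a <span class="caps">MAC</span> address learned on the overlay with the <span class="caps">IP</span> address of the remote <span class="caps">VTEP</span>, given after the <code>dst</code> keyword. Here is a minimal sketch to extract this mapping. <code>fdb_remotes</code> is a hypothetical helper and, to keep the example self-contained, it is fed with the two lines shown above instead of the live output of <code>bridge</code>:</p>

```shell
#!/bin/sh
# fdb_remotes is a hypothetical helper: it reads "bridge fdb show" output on
# stdin and prints one "MAC remote-VTEP" pair per line.
fdb_remotes() {
    # Skip empty lines (NF is 0), then look for the word following "dst".
    awk 'NF {
        for (i = 1; i != NF; i++)
            if ($i == "dst") print $1, $(i + 1)
    }'
}

printf '%s\n' \
    '3e:09:1c:e1:09:2e dev vxlan42 dst 192.168.2.2 self' \
    '0e:98:40:c6:58:10 dev vxlan42 dst 192.168.3.3 self' | fdb_remotes
```

<p>In the lab, the output of the <code>bridge</code> command above can be piped to <code>fdb_remotes</code> instead.</p>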


<h1 id="demo">Demo</h1>
<p>For a demo, have a look at the following video (it is also available
as an <a href="//media.luffy.cx/videos/2012-multicast-vxlan.ogv">Ogg Theora video</a>).</p>
<div class="lf-video-container"><div class="lf-video">
<iframe frameborder="0" width="480" height="270"
        src="//www.dailymotion.com/embed/video/xusell"></iframe>
</div></div>

<div class="footnote">
<hr>
<ol>
<li id="fn:native">
<p>The decision is usually made when the bandwidth used by the
       flow reaches some threshold. With <span class="caps">XORP</span>, this can be
       controlled with <code>switch-to-spt-threshold</code>. However, I was
       unable to make this work as expected. <span class="caps">XORP</span> never sends the
       appropriate <span class="caps">PIM</span> packets to make the switch. Therefore, for
       this lab, it has been configured to switch to native
       multicast at the first received packet.&#160;<a href="#fnref:native" rev="footnote" title="Jump back to footnote 1 in the text">&#8617;</a></p>
</li>
</ol>
</div>
]]>
            </content>
        </entry>
                <entry>
            <title type="html">Network lab with KVM</title>
            <author><name>Vincent Bernat</name></author>
            <link href="http://vincent.bernat.im/en/blog/2012-network-lab-kvm.html"/>
            <updated>2012-10-20T18:17:50+02:00</updated>
            <id>http://www.luffy.cx/en/blog/2012-network-lab-kvm.html</id>

            <content type="html">
<![CDATA[
<p>To experiment with network stuff, I was using
<a href="/en/blog/2011-uml-network-lab.html" title="Network lab with UML"><abbr title="User Mode Linux"><abbr title="User Mode Linux"><span class="caps">UML</span></abbr></abbr>-based network labs</a>. Many alternatives exist, like
<a href="http://www.gns3.net/" title="GNS3: Graphical Network Simulator"><span class="caps">GNS3</span></a>, <a href="http://www.netkit.org" title="Netkit: the poor man's system to experiment computer networking">Netkit</a>, <a href="http://www.marionnet.org/" title="Marionnet: a virtual network laboratory">Marionnet</a> or <a href="http://clownix.net/" title="Cloonix: dynamical topology virtual networks">Cloonix</a>. All of them are
great viable solutions but I still prefer to stick to my minimal
home-made solution with <abbr title="User Mode Linux"><span class="caps">UML</span></abbr> virtual machines. Here is&nbsp;why:</p>
<ul>
<li>I didn&#8217;t want to use <strong>disk images</strong>. They take a lot of space and
   they have to be maintained. They also become cluttered, especially
   if you try to reuse them across several labs. They are also
   difficult to&nbsp;share.</li>
<li>I want to be able to access my <strong>home directory</strong>. It contains the
   important configuration files related to the lab and I can put them
   in the right place thanks to symbolic links when the lab starts. It
   also makes exchanging files with the lab quite&nbsp;easy.</li>
<li>I don&#8217;t want to <strong>boot a complete system</strong>. This allows me to be
   cheap on memory and each virtual system should boot in a few&nbsp;seconds.</li>
</ul>
<p>The use of <abbr title="User Mode Linux"><span class="caps">UML</span></abbr> had some&nbsp;drawbacks:</p>
<ul>
<li>It may be buggy. For example, it is currently not possible to use
   <a href="http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=676184" title="Bug #676184: gdbserver not usable inside UML">gdbserver inside <abbr title="User Mode Linux"><abbr title="User Mode Linux"><span class="caps">UML</span></abbr></abbr></a> without a patch. Sometimes, the kernel
   won&#8217;t even&nbsp;compile.</li>
<li>It is&nbsp;slow.</li>
</ul>
<p>However, <abbr title="User Mode Linux"><span class="caps">UML</span></abbr> features <a href="http://user-mode-linux.sourceforge.net/hostfs.html" title="UML: Host File Access">HostFS</a>, a filesystem providing access to any
part of the host filesystem. This is the killer feature which allows
me to not use any virtual disk image and to get access to my home
directory right from the&nbsp;guest.</p>
<p>I discovered recently that <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr> provides <a href="http://wiki.qemu.org/Documentation/9psetup" title="Setting up 9P in KVM">9P</a>, a similar filesystem
on top of VirtIO, the paravirtualized <span class="caps">IO</span>&nbsp;framework.</p>
<div class="toc">
<ul>
<li><a href="#setting-up-the-lab">Setting up the lab</a><ul>
<li><a href="#booting-kvm-with-a-minimal-kernel">Booting <span class="caps">KVM</span> with a minimal kernel</a></li>
<li><a href="#initial-ramdisk">Initial ramdisk</a></li>
<li><a href="#root-filesystem">Root filesystem</a></li>
<li><a href="#network">Network</a></li>
</ul>
</li>
<li><a href="#debugging">Debugging</a><ul>
<li><a href="#kernel-debugging">Kernel debugging</a></li>
<li><a href="#userland-debugging">Userland debugging</a></li>
</ul>
</li>
<li><a href="#demo">Demo</a></li>
</ul>
</div>
<h1 id="setting-up-the-lab">Setting up the lab</h1>
<p>The setup of the lab is done with a
<a href="https://github.com/vincentbernat/network-lab/blob/master/lab-ecmp-ipv6/setup">single self-contained shell file</a>. The layout is similar to
<a href="/en/blog/2011-uml-network-lab.html" title="Network lab with UML">what I have done with <abbr title="User Mode Linux"><abbr title="User Mode Linux"><span class="caps">UML</span></abbr></abbr></a>. I will only highlight here the
most interesting&nbsp;steps.</p>
<h2 id="booting-kvm-with-a-minimal-kernel">Booting <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr> with a minimal kernel</h2>
<p>My initial goal was to experiment with
<a href="http://patchwork.ozlabs.org/patch/188562/" title="ipv6: add support of ECMP">Nicolas Dichtel&#8217;s IPv6 <span class="caps">ECMP</span> patch</a>. Therefore, I needed to
configure a custom kernel. I have started from <code>make defconfig</code>,
removed everything that was not necessary, added what I needed for my
lab (mostly network stuff) and added the appropriate options for VirtIO&nbsp;drivers:</p>
<div class="codehilite"><pre><span class="n">CONFIG_NET_9P_VIRTIO</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO_BLK</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO_NET</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO_CONSOLE</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_HW_RANDOM_VIRTIO</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO_RING</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO_PCI</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO_BALLOON</span><span class="p">=</span><span class="n">y</span>
<span class="n">CONFIG_VIRTIO_MMIO</span><span class="p">=</span><span class="n">y</span>
</pre></div>


<p>No modules. Grab the <a href="https://github.com/vincentbernat/network-lab/blob/master/lab-ecmp-ipv6/config-3.6%2Becmp%2Boverlayfs" title="Minimal .config">complete configuration</a> if you want to
have a&nbsp;look.</p>
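<p>Since a missing VirtIO option only shows up once the guest fails to
boot, it can help to check the configuration beforehand. Here is a
minimal sketch, assuming the option list above is the one you want
built in; <code>missing_virtio_options</code> is a hypothetical helper, not part
of the lab&nbsp;script:</p>

```shell
# Hypothetical helper: print the VirtIO options from the list above that
# are not built in (=y) in the kernel configuration file given as $1.
missing_virtio_options() {
    config=$1
    for opt in NET_9P_VIRTIO VIRTIO_BLK VIRTIO_NET VIRTIO_CONSOLE \
               HW_RANDOM_VIRTIO VIRTIO VIRTIO_RING VIRTIO_PCI \
               VIRTIO_BALLOON VIRTIO_MMIO; do
        # -x matches whole lines only, so CONFIG_VIRTIO=y is not
        # mistaken for CONFIG_VIRTIO_NET=y; -F disables regexes.
        grep -qxF "CONFIG_${opt}=y" "$config" || echo "CONFIG_${opt}"
    done
}
```

<p>Running <code>missing_virtio_options .config</code> at the root of the kernel
tree prints nothing when every option is&nbsp;enabled.</p>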
<p>From here, you can start your kernel with the following command
(<code>$LINUX</code> is the appropriate <code>bzImage</code>):</p>
<div class="codehilite"><pre>kvm <span class="se">\</span>
  -m 256m <span class="se">\</span>
  -display none <span class="se">\</span>
  -nodefconfig -no-user-config -nodefaults <span class="se">\</span>
  <span class="se">\</span>
  -chardev stdio,id<span class="o">=</span>charserial0,signal<span class="o">=</span>off <span class="se">\</span>
  -device isa-serial,chardev<span class="o">=</span>charserial0,id<span class="o">=</span>serial0 <span class="se">\</span>
  <span class="se">\</span>
  -chardev socket,id<span class="o">=</span>con0,path<span class="o">=</span><span class="nv">$<span class="caps">TMP</span></span>/vm-<span class="nv">$name</span>-console.pipe,server,nowait <span class="se">\</span>
  -mon <span class="nv">chardev</span><span class="o">=</span>con0,mode<span class="o">=</span>readline,default <span class="se">\</span>
  <span class="se">\</span>
  -kernel <span class="nv">$<span class="caps">LINUX</span></span> <span class="se">\</span>
  -append <span class="s2">&quot;init=/bin/sh console=ttyS0&quot;</span>
</pre></div>


<p>Of course, since there is no disk to boot from, the kernel will panic
when trying to mount the root filesystem. <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr> is configured to not
display video output (<code>-display none</code>). A serial port is defined and
uses <code>stdio</code> as a backend<sup id="fnref:signal"><a href="#fn:signal" rel="footnote">1</a></sup>. The kernel is configured to use
this serial port as a console (<code>console=ttyS0</code>). A VirtIO console
could have been used instead but it does not seem possible to make
it work early in the boot&nbsp;process.</p>
<p>The <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr> monitor is set up to listen on a Unix socket. It is possible
to connect to it with <code>socat UNIX:$TMP/vm-$name-console.pipe -</code>.</p>
<h2 id="initial-ramdisk">Initial ramdisk</h2>
<p><strong><span class="caps">UPDATED</span>:</strong> I was initially unable to mount the host filesystem as
the root filesystem for the guest directly by the kernel. In a
comment, <a href="http://joshtriplett.org/">Josh Triplett</a> told me to use
<code>/dev/root</code> as the mount tag to solve this problem. I keep using an
<em>initrd</em> in this post but the lab on <a href="https://github.com/vincentbernat/network-lab/blob/master/lab-ecmp-ipv6/setup">GitHub</a> has been updated
to not use&nbsp;one.</p>
<p>Here is how to build a small initial&nbsp;ramdisk:</p>
<div class="codehilite"><pre><span class="c"># Setup initrd</span>
setup_initrd<span class="o">()</span> <span class="o">{</span>
    info <span class="s2">&quot;Build initrd&quot;</span>
    <span class="nv"><span class="caps">DESTDIR</span></span><span class="o">=</span><span class="nv">$<span class="caps">TMP</span></span>/initrd
    mkdir -p <span class="nv">$<span class="caps">DESTDIR</span></span>

    <span class="c"># Setup busybox</span>
    copy_exec <span class="k">$(</span><span class="nv">$<span class="caps">WHICH</span></span> busybox<span class="k">)</span> /bin/busybox
    <span class="k">for </span>applet in <span class="k">$(${</span><span class="nv"><span class="caps">DESTDIR</span></span><span class="k">}</span>/bin/busybox --list<span class="k">)</span>; <span class="k">do</span>
<span class="k">        </span>ln -s busybox <span class="k">${</span><span class="nv"><span class="caps">DESTDIR</span></span><span class="k">}</span>/bin/<span class="k">${</span><span class="nv">applet</span><span class="k">}</span>
    <span class="k">done</span>

    <span class="c"># Setup init</span>
    cp <span class="nv">$<span class="caps">PROGNAME</span></span> <span class="k">${</span><span class="nv"><span class="caps">DESTDIR</span></span><span class="k">}</span>/init

    <span class="nb">cd</span> <span class="s2">&quot;${<span class="caps">DESTDIR</span>}&quot;</span> <span class="o">&amp;&amp;</span> find . | <span class="se">\</span>
       cpio --quiet -R 0:0 -o -H newc | <span class="se">\</span>
       gzip &gt; <span class="nv">$<span class="caps">TMP</span></span>/initrd.gz
<span class="o">}</span>
</pre></div>


<p>The <code>copy_exec</code> function is stolen from the <code>initramfs-tools</code> package
in Debian. It will ensure that the appropriate libraries are also
copied. Another solution would have been to use a static <code>busybox</code>.</p>
<p>The setup script is copied as <code>/init</code> in the initial ramdisk. It will
detect it has been invoked as such. If it was omitted, a shell would
be spawned instead. Remove the <code>cp</code> call if you want to experiment&nbsp;manually.</p>
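<p>One way to detect such an invocation is sketched below. This is an
assumption about how it could be done, not necessarily what the lab
script does: the kernel runs <code>/init</code> from the initial ramdisk as the
first process, so both the name and the <abbr title="Process Identifier"><span class="caps">PID</span></abbr> can be&nbsp;checked:</p>

```shell
# Hypothetical check: succeed when invoked as /init with PID 1.
# $1 is the path the script was invoked with, $2 is its PID.
running_as_init() {
    [ "$1" = "/init" ] && [ "$2" -eq 1 ]
}

# Typical call at the top of a script; on the host, nothing is printed.
if running_as_init "$0" "$$"; then
    echo "Running as init inside the guest"
fi
```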
<p>The flag <code>-initrd</code> allows <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr> to use this initial&nbsp;ramdisk.</p>
<h2 id="root-filesystem">Root filesystem</h2>
<p>Let&#8217;s mount our root filesystem using <em>9P</em>. This is quite easy. First
<abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr> needs to be configured to export the host filesystem to
the&nbsp;guest:</p>
<div class="codehilite"><pre>kvm <span class="se">\</span>
  <span class="k">${</span><span class="nv">PREVIOUS_ARGS</span><span class="k">}</span> <span class="se">\</span>
  -fsdev <span class="nb">local</span>,security_model<span class="o">=</span>passthrough,id<span class="o">=</span>fsdev-root,path<span class="o">=</span><span class="k">${</span><span class="nv"><span class="caps">ROOT</span></span><span class="k">}</span>,readonly <span class="se">\</span>
  -device virtio-9p-pci,id<span class="o">=</span>fs-root,fsdev<span class="o">=</span>fsdev-root,mount_tag<span class="o">=</span>rootshare
</pre></div>


<p><code>${ROOT}</code> can either be <code>/</code> or any directory containing a complete
filesystem. Mounting it from the guest is quite&nbsp;easy:</p>
<div class="codehilite"><pre>mkdir -p /target/ro
mount -t 9p rootshare /target/ro -o <span class="nv">trans</span><span class="o">=</span>virtio,version<span class="o">=</span>9p2000.u
</pre></div>


<p>You should find a complete root filesystem inside <code>/target/ro</code>. I have
used <code>version=9p2000.u</code> instead of <code>version=9p2000.L</code> because the
latter does not allow a program to <code>mount()</code> a host mount
point<sup id="fnref:9pversion"><a href="#fn:9pversion" rel="footnote">2</a></sup>.</p>
<p>Now, you have a read-only root filesystem (because you don&#8217;t want to
mess with your existing root filesystem and moreover, you did not run
this lab as root, did you?). Let&#8217;s use a union filesystem. Debian
comes with <a href="http://aufs.sourceforge.net/" title="AUFS: advanced multi-layered unification filesystem"><span class="caps">AUFS</span></a> while Ubuntu and OpenWRT have migrated to
<a href="http://git.kernel.org/?p=linux/kernel/git/mszeredi/vfs.git;a=summary" title="overlayfs git repository">overlayfs</a>. I was previously using <span class="caps">AUFS</span> but got errors in some
specific cases. It is still not clear
<a href="http://lwn.net/Articles/447650/" title="LWN: Debating overlayfs">which one will end up in the kernel</a>. So, let&#8217;s try
<em>overlayfs</em>.</p>
<p>I didn&#8217;t find any patchset ready to be applied on top of my kernel
tree. I was working with <a href="http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=summary" title="net-next git repository">David Miller&#8217;s net-next tree</a>. Here
is how I have applied the <em>overlayfs</em> patch on top of&nbsp;it:</p>
<div class="codehilite"><pre><span class="gp">$</span> git remote add torvalds git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
<span class="gp">$</span> git fetch torvalds
<span class="gp">$</span> git remote add overlayfs git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git
<span class="gp">$</span> git fetch overlayfs
<span class="gp">$</span> git merge-base overlayfs.v15 v3.6
<span class="go">4cbe5a555fa58a79b6ecbb6c531b8bab0650778d</span>
<span class="gp">$</span> git checkout -b net-next+overlayfs
<span class="gp">$</span> git cherry-pick 4cbe5a555fa58a79b6ecbb6c531b8bab0650778d..overlayfs.v15
</pre></div>


<p>Don&#8217;t forget to enable <code>CONFIG_OVERLAYFS_FS</code> in <code>.config</code>. Here is how
I configured the whole root&nbsp;filesystem:</p>
<div class="codehilite"><pre>info <span class="s2">&quot;Setup overlayfs&quot;</span>
mkdir /target
mkdir /target/ro
mkdir /target/rw
mkdir /target/overlay
<span class="c"># Version 9p2000.u allows access to /dev and /sys and mounting new</span>
<span class="c"># partitions over them. This is not the case for 9p2000.L.</span>
mount -t 9p        rootshare /target/ro      -o <span class="nv">trans</span><span class="o">=</span>virtio,version<span class="o">=</span>9p2000.u
mount -t tmpfs     tmpfs     /target/rw      -o rw
mount -t overlayfs overlayfs /target/overlay -o <span class="nv">lowerdir</span><span class="o">=</span>/target/ro,upperdir<span class="o">=</span>/target/rw
mount -n -t proc  proc /target/overlay/proc
mount -n -t sysfs sys  /target/overlay/sys

info <span class="s2">&quot;Mount home directory on /root&quot;</span>
mount -t 9p homeshare /target/overlay/root -o <span class="nv">trans</span><span class="o">=</span>virtio,version<span class="o">=</span>9p2000.L,access<span class="o">=</span>0,rw

info <span class="s2">&quot;Mount lab directory on /lab&quot;</span>
mkdir /target/overlay/lab
mount -t 9p labshare /target/overlay/lab -o <span class="nv">trans</span><span class="o">=</span>virtio,version<span class="o">=</span>9p2000.L,access<span class="o">=</span>0,rw

info <span class="s2">&quot;Chroot&quot;</span>
<span class="nb">export </span><span class="nv"><span class="caps">STATE</span></span><span class="o">=</span>1
cp <span class="s2">&quot;$<span class="caps">PROGNAME</span>&quot;</span> /target/overlay
<span class="nb">exec </span>chroot /target/overlay <span class="s2">&quot;$<span class="caps">PROGNAME</span>&quot;</span>
</pre></div>


<p>You have to export your <code>${HOME}</code> and the lab directory from&nbsp;host:</p>
<div class="codehilite"><pre>kvm <span class="se">\</span>
  <span class="k">${</span><span class="nv">PREVIOUS_ARGS</span><span class="k">}</span> <span class="se">\</span>
  -fsdev <span class="nb">local</span>,security_model<span class="o">=</span>passthrough,id<span class="o">=</span>fsdev-root,path<span class="o">=</span><span class="k">${</span><span class="nv"><span class="caps">ROOT</span></span><span class="k">}</span>,readonly <span class="se">\</span>
  -device virtio-9p-pci,id<span class="o">=</span>fs-root,fsdev<span class="o">=</span>fsdev-root,mount_tag<span class="o">=</span>rootshare <span class="se">\</span>
  -fsdev <span class="nb">local</span>,security_model<span class="o">=</span>none,id<span class="o">=</span>fsdev-home,path<span class="o">=</span><span class="k">${</span><span class="nv"><span class="caps">HOME</span></span><span class="k">}</span> <span class="se">\</span>
  -device virtio-9p-pci,id<span class="o">=</span>fs-home,fsdev<span class="o">=</span>fsdev-home,mount_tag<span class="o">=</span>homeshare <span class="se">\</span>
  -fsdev <span class="nb">local</span>,security_model<span class="o">=</span>none,id<span class="o">=</span>fsdev-lab,path<span class="o">=</span><span class="k">$(</span>dirname <span class="s2">&quot;$<span class="caps">PROGNAME</span>&quot;</span><span class="k">)</span> <span class="se">\</span>
  -device virtio-9p-pci,id<span class="o">=</span>fs-lab,fsdev<span class="o">=</span>fsdev-lab,mount_tag<span class="o">=</span>labshare
</pre></div>
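<p>Each share needs a <code>-fsdev</code> option and a matching <code>-device</code> option
whose identifiers must agree. A small helper can generate the pair. This
is only a sketch: <code>share_args</code> is a hypothetical name and the naming
scheme simply mirrors the one used&nbsp;above:</p>

```shell
# Hypothetical helper: print the -fsdev/-device pair exporting one
# directory over 9P. Arguments: short name, host path, security model
# and, optionally, "readonly".
share_args() {
    tag=$1; path=$2; model=$3; extra=${4:+,$4}
    printf '%s\n' "-fsdev local,security_model=${model},id=fsdev-${tag},path=${path}${extra}"
    printf '%s\n' "-device virtio-9p-pci,id=fs-${tag},fsdev=fsdev-${tag},mount_tag=${tag}share"
}
```

<p>For example, <code>share_args root / passthrough readonly</code> prints the two
options used for the root filesystem. When the output is substituted
unquoted into the <code>kvm</code> command line, paths containing spaces will be
split, so this only suits simple&nbsp;paths.</p>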


<h2 id="network">Network</h2>
<p>You know what is missing from our network lab? Network setup. For each
<span class="caps">LAN</span> that I will need, I spawn a <span class="caps">VDE</span>&nbsp;switch:</p>
<div class="codehilite"><pre><span class="c"># Setup a <span class="caps">VDE</span> switch</span>
setup_switch<span class="o">()</span> <span class="o">{</span>
    info <span class="s2">&quot;Setup switch $1&quot;</span>
    screen -t <span class="s2">&quot;sw-$1&quot;</span> <span class="se">\</span>
        start-stop-daemon --make-pidfile --pidfile <span class="s2">&quot;$<span class="caps">TMP</span>/switch-$1.pid&quot;</span> <span class="se">\</span>
        --start --startas <span class="k">$(</span><span class="nv">$<span class="caps">WHICH</span></span> vde_switch<span class="k">)</span> -- <span class="se">\</span>
        --sock <span class="s2">&quot;$<span class="caps">TMP</span>/switch-$1.sock&quot;</span>
    screen -X <span class="k">select </span>0
<span class="o">}</span>
</pre></div>


<p>To attach an interface to the newly created <span class="caps">LAN</span>, I&nbsp;use:</p>
<div class="codehilite"><pre><span class="nv">mac</span><span class="o">=</span><span class="k">$(</span><span class="nb">echo</span> <span class="nv">$name</span>-<span class="nv">$net</span> | sha1sum | <span class="se">\</span>
            awk <span class="s1">&#39;{print &quot;52:54:&quot; substr($1,0,2) &quot;:&quot; substr($1, 2, 2) &quot;:&quot; substr($1, 4, 2) &quot;:&quot; substr($1, 6, 2)}&#39;</span><span class="k">)</span>
kvm <span class="se">\</span>
  <span class="k">${</span><span class="nv">PREVIOUS_ARGS</span><span class="k">}</span> <span class="se">\</span>
  -net nic,model<span class="o">=</span>virtio,macaddr<span class="o">=</span><span class="nv">$mac</span>,vlan<span class="o">=</span><span class="nv">$net</span> <span class="se">\</span>
  -net vde,sock<span class="o">=</span><span class="nv">$<span class="caps">TMP</span></span>/switch-<span class="nv">$net</span>.sock,vlan<span class="o">=</span><span class="nv">$net</span>
</pre></div>


<p>The use of a <span class="caps">VDE</span> switch allows me to run the lab as a non-root
user. It is possible to give Internet access to each <abbr title="Virtual Machine"><span class="caps">VM</span></abbr>, either by
using the <code>-net user</code> flag or by running <code>slirpvde</code> on a special switch. I
prefer the latter solution since it also allows the <abbr title="Virtual Machine"><span class="caps">VM</span></abbr>s to talk to each&nbsp;other.</p>
<h1 id="debugging">Debugging</h1>
<p>This lab was mostly done to debug both the kernel and Quagga. Each of
them can be debugged&nbsp;remotely.</p>
<h2 id="kernel-debugging">Kernel debugging</h2>
<p>While the kernel features <a href="http://kgdb.linsyssoft.com/" title="KGDB: Linux Kernel Source Level Debugger"><span class="caps">KGDB</span></a>, its own debugger, compatible with
<a href="http://www.gnu.org/software/gdb/" title="GDB: The GNU Project Debugger"><span class="caps">GDB</span></a>, it is easier to use the <em>remote <span class="caps">GDB</span> server</em> built into
<abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr>.</p>
<div class="codehilite"><pre>kvm <span class="se">\</span>
  <span class="k">${</span><span class="nv">PREVIOUS_ARGS</span><span class="k">}</span> <span class="se">\</span>
  -gdb unix:<span class="nv">$<span class="caps">TMP</span></span>/vm-<span class="nv">$name</span>-gdb.pipe,server,nowait
</pre></div>


<p>To connect to the remote <span class="caps">GDB</span> server from the host, first locate the
<code>vmlinux</code> file at the root of the source tree and run <span class="caps">GDB</span> on it. The
kernel has to be compiled with <code>CONFIG_DEBUG_INFO=y</code> to get the
appropriate debugging symbols. Then, use <code>socat</code> with the Unix socket
to attach to the remote&nbsp;debugger:</p>
<div class="codehilite"><pre><span class="gp">$</span> gdb vmlinux
<span class="go"><span class="caps">GNU</span> gdb (<span class="caps">GDB</span>) 7.4.1-debian</span>
<span class="go">Reading symbols from /home/bernat/src/linux/vmlinux...done.</span>
<span class="go">(gdb) target remote | socat <span class="caps">UNIX</span>:$<span class="caps">TMP</span>/vm-$name-gdb.pipe -</span>
<span class="go">Remote debugging using | socat <span class="caps">UNIX</span>:/tmp/tmp.W36qWnrCEj/vm-r1-gdb.pipe -</span>
<span class="go">native_safe_halt () at /home/bernat/src/linux/arch/x86/include/asm/irqflags.h:50</span>
<span class="go">50  }</span>
<span class="go">(gdb)</span>
</pre></div>


<p>You can now set breakpoints and resume the execution of the&nbsp;kernel.</p>
<p>It is easier to debug the kernel if optimizations are not
enabled. However, it is not possible to
<a href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20217" title="Bug 20217 - Switching off the optimization triggers undefined reference at link time when building Linux kernel">disable them globally</a>. You can still disable them
for individual files. For example, to debug <code>net/ipv6/route.c</code>, just add
<code>CFLAGS_route.o = -O0</code> to <code>net/ipv6/Makefile</code>, remove
<code>net/ipv6/route.o</code> and type <code>make</code>.</p>
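<p>These steps can be wrapped in a small function. This is a sketch
under the assumption that the flag should be appended only once;
<code>disable_optimizations</code> is a hypothetical name:</p>

```shell
# Sketch: build one object file without optimizations. $1 is the root
# of the kernel tree, $2 the directory of the file and $3 the object.
disable_optimizations() {
    tree=$1; dir=$2; obj=$3
    line="CFLAGS_${obj} = -O0"
    # Append the flag only once, even if the function is called again.
    grep -qxF "$line" "$tree/$dir/Makefile" || echo "$line" >> "$tree/$dir/Makefile"
    # Remove the stale object so that make rebuilds it with the new flag.
    rm -f "$tree/$dir/$obj"
}

# Example: disable_optimizations ~/src/linux net/ipv6 route.o && make -C ~/src/linux
```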
<h2 id="userland-debugging">Userland debugging</h2>
<p>To debug a program inside <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr>, you can just use <code>gdb</code> as usual. Your
<code>$HOME</code> directory is available, so this should be
straightforward. However, if you want to perform some remote
debugging, that&#8217;s quite easy. Add a new serial port to <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr>:</p>
<div class="codehilite"><pre>kvm <span class="se">\</span>
  <span class="k">${</span><span class="nv">PREVIOUS_ARGS</span><span class="k">}</span> <span class="se">\</span>
  -chardev socket,id<span class="o">=</span>charserial1,path<span class="o">=</span><span class="nv">$<span class="caps">TMP</span></span>/vm-<span class="nv">$name</span>-serial.pipe,server,nowait <span class="se">\</span>
  -device isa-serial,chardev<span class="o">=</span>charserial1,id<span class="o">=</span>serial1
</pre></div>


<p>Start <code>gdbserver</code> in the&nbsp;guest:</p>
<div class="codehilite"><pre><span class="gp">$</span> libtool execute gdbserver /dev/ttyS1 zebra/zebra
<span class="go">Process /root/code/orange/quagga/build/zebra/.libs/lt-zebra created; pid = 800</span>
<span class="go">Remote debugging using /dev/ttyS1</span>
</pre></div>


<p>And from the host, you can attach to the remote&nbsp;process:</p>
<div class="codehilite"><pre><span class="gp">$</span> libtool execute gdb zebra/zebra
<span class="go"><span class="caps">GNU</span> gdb (<span class="caps">GDB</span>) 7.4.1-debian</span>
<span class="go">Reading symbols from /home/bernat/code/orange/quagga/build/zebra/.libs/lt-zebra...done.</span>
<span class="go">(gdb) target remote | socat <span class="caps">UNIX</span>:/tmp/tmp.W36qWnrCEj/vm-r1-serial.pipe -</span>
<span class="go">Remote debugging using | socat <span class="caps">UNIX</span>:/tmp/tmp.W36qWnrCEj/vm-r1-serial.pipe -</span>
<span class="go">Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.</span>
<span class="go">Loaded symbols for /lib64/ld-linux-x86-64.so.2</span>
<span class="go">0x00007ffff7dddaf0 in ?? () from /lib64/ld-linux-x86-64.so.2</span>
<span class="go">(gdb)</span>
</pre></div>


<h1 id="demo">Demo</h1>
<p>For a demo, have a look at the following video (it is also available
as an <a href="//media.luffy.cx/videos/2012-network-lab-kvm.ogv">Ogg Theora video</a>).</p>
<div class="lf-video-container"><div class="lf-video">
<iframe frameborder="0" width="480" height="270"
        src="//www.dailymotion.com/embed/video/xuglsg"></iframe>
</div></div>

<div class="footnote">
<hr>
<ol>
<li id="fn:signal">
<p><code>stdio</code> is configured such that signals are not
       enabled. <abbr title="Kernel-based Virtual Machine"><span class="caps">KVM</span></abbr> won&#8217;t stop when receiving <code>SIGINT</code>. This is
       important for the usage we want to have.&#160;<a href="#fnref:signal" rev="footnote" title="Jump back to footnote 1 in the text">&#8617;</a></p>
</li>
<li id="fn:9pversion">
<p>Therefore, it is not possible to mount a fresh <code>/proc</code>
          on top of the existing one. I have searched a bit but
          didn&#8217;t find why. Any comment on this is welcome.&#160;<a href="#fnref:9pversion" rev="footnote" title="Jump back to footnote 2 in the text">&#8617;</a></p>
</li>
</ol>
</div>
]]>
            </content>
        </entry>
                <entry>
            <title type="html">Switching to the awesome window manager</title>
            <author><name>Vincent Bernat</name></author>
            <link href="http://vincent.bernat.im/en/blog/2012-awesome-wm.html"/>
            <updated>2012-07-28T20:18:09+02:00</updated>
            <id>http://www.luffy.cx/en/blog/2012-awesome-wm.html</id>

            <content type="html">
<![CDATA[
<p>I have happily <a href="/en/blog/2011-fvwm-configuration.html" title="My FVWM configuration">used <abbr title="F? Virtual Window Manager"><span class="caps">FVWM</span></abbr></a> as my window manager for more than 10
years. However, I recently got tired of manually arranging windows and
using the mouse so much. A window manager is one of the few pieces
of software that get in your way at every moment, which explains why
there are so many of them and why we may spend so much time on&nbsp;them.</p>
<p>I decided to try a <a href="http://en.wikipedia.org/wiki/Tiling_window_manager" title="Tiling window manager on Wikipedia">tiling window manager</a>. While <a href="http://i3wm.org/" title="i3: improved tiling wm">i3</a> seemed
pretty hot and powerful (watch the <a href="http://i3wm.org/screenshots/" title="Screencast of i3 window manager">screencast</a>!), I
really wanted something configurable and extensible with some
language. So far, the common choices&nbsp;are:</p>
<ul>
<li><a href="http://awesome.naquadah.org" title="The awesome window manager">awesome</a>, written in C, configurable and extensible in <a href="http://www.lua.org" title="The programming language Lua">Lua</a>,</li>
<li><a href="http://www.nongnu.org/stumpwm/" title="The Stump Window Manager">StumpWM</a>, written, configurable and extensible in <a href="http://en.wikipedia.org/wiki/Common_Lisp" title="Common Lisp on Wikipedia">Common Lisp</a>,</li>
<li><a href="http://xmonad.org/" title="xmonad, the tiling window manager that rocks">xmonad</a>, written, configurable and extensible in <a href="http://www.haskell.org" title="The Haskell Programming Language">Haskell</a>.</li>
</ul>
<p>I chose <em>awesome</em>, despite the fact that <em>StumpWM</em>&#8217;s vote for Lisp
seemed a better fit (but it is more minimalist). I hope there is some
parallel universe where I enjoy <em>StumpWM</em>.</p>
<div class="toc">
<ul>
<li><a href="#awesome-configuration">Awesome configuration</a><ul>
<li><a href="#keybindings">Keybindings</a></li>
<li><a href="#quake-console">Quake console</a></li>
<li><a href="#xrandr">XRandR</a></li>
<li><a href="#widgets">Widgets</a></li>
</ul>
</li>
<li><a href="#miscellaneous">Miscellaneous</a><ul>
<li><a href="#keyboard-configuration">Keyboard configuration</a></li>
<li><a href="#getting-rid-of-most-gnome-stuff">Getting rid of most <span class="caps">GNOME</span> stuff</a></li>
<li><a href="#terminal-color-scheme">Terminal color scheme</a></li>
</ul>
</li>
<li><a href="#next-steps">Next steps</a></li>
</ul>
</div>
<p>Visually, here is what I got so&nbsp;far:</p>
<p><a href="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/screenshot-2012-07-21--20-18-45.jpg"><img alt="awesome dual screen setup" src="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/screenshot-2012-07-21--20-18-45-small.jpg" title="Dualhead setup with awesome"></a></p>
<h1 id="awesome-configuration">Awesome configuration</h1>
<p>Without a configuration file, <em>awesome</em> does nothing. It does not come
with any hard-coded behavior: <strong>everything needs to be configured</strong>
through its <em>Lua</em> configuration file. Of course, a default one is
provided but you can also start from scratch. If you like to control
your window manager, this is somewhat&nbsp;wonderful.</p>
<p><em>awesome</em> is well documented. The <a href="http://awesome.naquadah.org/wiki/Main_Page" title="awesome wiki">wiki</a> provides a <a href="http://awesome.naquadah.org/wiki/FAQ" title="awesome FAQ"><span class="caps">FAQ</span></a>, a
<a href="http://awesome.naquadah.org/wiki/My_first_awesome" title="awesome tutorial">good introduction</a> and the <a href="file:///usr/share/doc/awesome/luadoc/index.html" title="awesome API reference"><span class="caps">API</span> reference</a> is concise enough to
be read from the top to the bottom. Knowing <em>Lua</em> is not mandatory since
it is quite easy to dive into such a&nbsp;language.</p>
<p>I have posted my configuration on <a href="https://github.com/vincentbernat/awesome-configuration" title="My awesome configuration on GitHub">GitHub</a>. It should not be used as
is but some snippets may be worth stealing and adapting to your
own configuration. The following sections shed light on some notable&nbsp;points.</p>
<h2 id="keybindings">Keybindings</h2>
<p>Ten years ago was the era of scavenger hunts to recover
<a href="http://en.wikipedia.org/wiki/Model_M_keyboard" title="IBM Model M keyboard on Wikipedia"><span class="caps">IBM</span> Model M keyboards</a> from waste containers. They were great to
type on and they did not feature the infamous Windows keys. Nowadays,
it is harder to get such a keyboard. All my keyboards now have
Windows keys. This is a major change when it comes to configuring a
window manager: the left Windows key is mapped to <code>Mod4</code> and is
usually unused by most applications and can therefore be dedicated to
the window&nbsp;manager.</p>
<p>The main problem with the ability to define many keybindings is
remembering the less frequently used ones. I have monkey-patched the
<code>awful.key</code> module to be able to attach a documentation string to a
keybinding. I have documented the whole process on the
<a href="http://awesome.naquadah.org/wiki/Document_keybindings" title="Documenting keybindings in awesome wiki"><em>awesome</em> wiki</a>.</p>
<p><img alt="awesome online help" src="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/keybindings.jpg" title="Contextual help for available keybindings"></p>
<h2 id="quake-console">Quake console</h2>
<p>A <em>Quake console</em> is a drop-down terminal which can be toggled with
some key. I was heavily relying on it in <abbr title="F? Virtual Window Manager"><span class="caps">FVWM</span></abbr>. I think this is still a
useful addition to any <em>awesome</em> configuration. There are several
possible solutions documented in the <a href="http://awesome.naquadah.org/wiki/Drop-down_terminal" title="Drop-down terminal on awesome wiki"><em>awesome</em> wiki</a>. I
have added my own<sup id="fnref:own"><a href="#fn:own" rel="footnote">1</a></sup> which works great for&nbsp;me.</p>
<p><img alt="Quake console" src="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/quake.jpg" title="Quake console at the top of the screen"></p>
<h2 id="xrandr">XRandR</h2>
<p>XRandR is an extension which allows you to dynamically reconfigure
outputs: you plug an external screen into your laptop and you issue
a command to enable&nbsp;it:</p>
<div class="codehilite"><pre><span class="gp">$</span> xrandr --output <span class="caps">VGA</span>-1 --auto --left-of <span class="caps">LVDS</span>-1
</pre></div>


<p><em>awesome</em> detects the change and will restart automatically. Laptops
usually come with a special key to enable/disable an external
screen. Nowadays, this key does nothing unless configured
appropriately. Out of the box, it is mapped to the <code>XF86Display</code> symbol. I
have associated this key with a function that will
<a href="https://github.com/vincentbernat/awesome-configuration/blob/master/rc/xrandr.lua">cycle through possible configurations</a> depending on the
plugged screens. For example, if I plug an external screen into my
laptop, I can cycle through the following&nbsp;configurations:</p>
<ul>
<li>only the internal&nbsp;screen,</li>
<li>only the external&nbsp;screen,</li>
<li>internal screen on the left, external screen on the&nbsp;right,</li>
<li>external screen on the left, internal screen on the&nbsp;right,</li>
<li>no&nbsp;change.</li>
</ul>
<p>The proposed configuration is displayed using <em>naughty</em>, the
notification system integrated in <em>awesome</em>.</p>
<p><img alt="Notification of screen reconfiguration" src="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/xrandr.png" title="Notification displaying the configuration to be applied through XRandR"></p>
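<p>Cycling through configurations starts with knowing which outputs are
connected. My implementation is in <em>Lua</em>, but the first step can be
sketched in shell by filtering the output of <code>xrandr --query</code>. The
function name <code>connected_outputs</code> is hypothetical:</p>

```shell
# Sketch: print the names of connected outputs, one per line, from the
# output of "xrandr --query" read on standard input.
connected_outputs() {
    awk '$2 == "connected" { print $1 }'
}

# Hypothetical usage: enable every connected output with its preferred mode.
# xrandr --query | connected_outputs | while read -r output; do
#     xrandr --output "$output" --auto
# done
```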
<h2 id="widgets">Widgets</h2>
<p>I was previously using <a href="http://conky.sourceforge.net/" title="Conky, a light-weight system monitor">Conky</a> to display various system-related
information, like free space, <span class="caps">CPU</span> usage and network usage. <em>awesome</em>
comes with widgets that can fit the same use. I am relying on
<em>vicious</em>, a contributed widget manager, to manage most of them. It
allows one to attach a function whose task is to fetch values to be
displayed. This is quite&nbsp;powerful.</p>
<p>Here is an example with a volume&nbsp;widget:</p>
<div class="codehilite"><pre><span class="kd">local</span> <span class="n">volwidget</span> <span class="o">=</span> <span class="n">widget</span><span class="p">({</span> <span class="nb">type</span> <span class="o">=</span> <span class="s2">&quot;</span><span class="s">textbox&quot;</span> <span class="p">})</span>
<span class="n">vicious</span><span class="p">.</span><span class="n">register</span><span class="p">(</span><span class="n">volwidget</span><span class="p">,</span> <span class="n">vicious</span><span class="p">.</span><span class="n">widgets</span><span class="p">.</span><span class="n">volume</span><span class="p">,</span>
         <span class="s1">&#39;</span><span class="s">&lt;span font=&quot;Terminus 8&quot;&gt;$2 $1%&lt;/span&gt;&#39;</span><span class="p">,</span>
        <span class="mi">2</span><span class="p">,</span> <span class="s2">&quot;</span><span class="s">Master&quot;</span><span class="p">)</span>
<span class="n">volwidget</span><span class="p">:</span><span class="n">buttons</span><span class="p">(</span><span class="n">awful</span><span class="p">.</span><span class="n">util</span><span class="p">.</span><span class="n">table</span><span class="p">.</span><span class="n">join</span><span class="p">(</span>
             <span class="n">awful</span><span class="p">.</span><span class="n">button</span><span class="p">({</span> <span class="p">},</span> <span class="mi">1</span><span class="p">,</span> <span class="n">volume</span><span class="p">.</span><span class="n">mixer</span><span class="p">),</span>
             <span class="n">awful</span><span class="p">.</span><span class="n">button</span><span class="p">({</span> <span class="p">},</span> <span class="mi">3</span><span class="p">,</span> <span class="n">volume</span><span class="p">.</span><span class="n">toggle</span><span class="p">),</span>
             <span class="n">awful</span><span class="p">.</span><span class="n">button</span><span class="p">({</span> <span class="p">},</span> <span class="mi">4</span><span class="p">,</span> <span class="n">volume</span><span class="p">.</span><span class="n">increase</span><span class="p">),</span>
             <span class="n">awful</span><span class="p">.</span><span class="n">button</span><span class="p">({</span> <span class="p">},</span> <span class="mi">5</span><span class="p">,</span> <span class="n">volume</span><span class="p">.</span><span class="n">decrease</span><span class="p">)))</span>
</pre></div>


<p>You can also use a function to format the text as you wish. For
example, you can display a value in red if it is too low. Have a look
at my <a href="https://github.com/vincentbernat/awesome-configuration/blob/335b80262dd3c3995eb35848ddda6bb47bc742c6/rc/widgets.lua#L24-55" title="Configuration of a battery widget with vicious">battery widget</a> for an&nbsp;example.</p>
<p><img alt="Various widgets" src="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/widgets.png" title="Various widgets: disk usage, CPU and memory usage, tags and layout, network usage, date and system tray, volume and date"></p>
<h1 id="miscellaneous">Miscellaneous</h1>
<p>While I was working on my <em>awesome</em> configuration, I also changed some
other desktop-related&nbsp;bits.</p>
<h2 id="keyboard-configuration">Keyboard configuration</h2>
<p>I set up all my keyboards to use the <span class="caps">QWERTY</span> layout. I use a
<a href="http://en.wikipedia.org/wiki/Compose_key" title="Compose key on Wikipedia">compose key</a> to input special characters like &#8220;é&#8221;. I have also
recently started using <em>Caps Lock</em> as a <em>Control</em> key. All of this has been
well supported by X11 for ages. I am also mapping the <em>Pause</em> key to the
<code>XF86ScreenSaver</code> key symbol, which is in turn bound to a function
that triggers <code>xautolock</code> to lock the&nbsp;screen.</p>
<p>Thanks to a great article about
<a href="http://madduck.net/docs/extending-xkb/" title="Extending the X keyboard map with xkb">extending the X keyboard map with <em>xkb</em></a>, I discovered
that X was able to switch from one layout to another using
groups<sup id="fnref:perwindow"><a href="#fn:perwindow" rel="footnote">2</a></sup>. I finally opted for this simple&nbsp;configuration:</p>
<div class="codehilite"><pre><span class="gp">$</span> setxkbmap us,fr <span class="s1">&#39;&#39;</span> compose:rwin ctrl:nocaps grp:rctrl_rshift_toggle
<span class="gp">$</span> xmodmap -e <span class="s1">&#39;keysym Pause = XF86ScreenSaver&#39;</span>
</pre></div>


<p>I switch from <code>us</code> to <code>fr</code> by pressing both right <em>Control</em> and right
<em>Shift</em>&nbsp;keys.</p>
<h2 id="getting-rid-of-most-gnome-stuff">Getting rid of most <span class="caps">GNOME</span> stuff</h2>
<p>Less than one year ago, to take a step forward to the future, I
started to <a href="/en/blog/2011-gnome-power-manager.html" title="GNOME Power Manager without GNOME desktop">heavily rely on some <span class="caps">GNOME</span> components</a> like <em><span class="caps">GNOME</span>
Display Manager</em>, <em><span class="caps">GNOME</span> Power Manager</em>, the screen saver,
<code>gnome-session</code>, <code>gnome-settings-daemon</code> and others. I had numerous
problems when I tried to set up everything without pulling the whole
<span class="caps">GNOME</span> stack. At each <span class="caps">GNOME</span> update, something was broken: the
screensaver didn&#8217;t start automatically anymore until a full session
restart or some keybindings were randomly hijacked by
<code>gnome-settings-daemon</code>.</p>
<p>Therefore, I have decided to get rid of most of those components. I
have replaced <em><span class="caps">GNOME</span> Power Manager</em> with system-level tools like
<em>sleepd</em> and the <span class="caps">PM</span> utilities. I replaced the <span class="caps">GNOME</span> screensaver with
<a href="http://i3wm.org/i3lock/" title="i3lock: a simpler screen locker">i3lock</a> and <em>xautolock</em>. <em><span class="caps">GDM</span></em> has been replaced by <a href="http://slim.berlios.de/" title="SLiM: Simple Login Manager">SLiM</a> which
now features <em>ConsoleKit</em> support<sup id="fnref:consolekit"><a href="#fn:consolekit" rel="footnote">3</a></sup>. I use <code>~/.gtkrc-2.0</code>
and <code>~/.config/gtk-3.0/settings.ini</code> to configure <abbr title="The GIMP Toolkit"><span class="caps">GTK</span></abbr>+.</p>
<p>The future will&nbsp;wait.</p>
<h2 id="terminal-color-scheme">Terminal color scheme</h2>
<p>I am using <a href="http://software.schmorp.de/pkg/rxvt-unicode.html" title="rxvt-unicode homepage">rxvt-unicode</a> as my terminal with a black background
(and some light transparency). The default color scheme is suboptimal
on the readability&nbsp;front.</p>
<p>Sharing terminal color schemes seems to be a <a href="https://bbs.archlinux.org/viewtopic.php?id=51818" title="Terminal color scheme thread on Arch Linux forum">popular</a>
<a href="http://xcolors.net/" title="Xcolors.net">activity</a>. I finally opted for the
<a href="http://xcolors.net/dl/derp" title="derp color scheme">&#8220;<em>derp</em>&#8221; color scheme</a> which brings a major improvement over the
default&nbsp;configuration.</p>
<p><img alt="Comparison of terminal color schemes" src="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/terminal-colors.png" title="Comparison of the default color scheme (bottom) with the new one (top)"></p>
<p>I have also switched to Xft for font rendering using <em>DejaVu Sans
Mono</em> as my default font (instead of <code>fixed</code>) with the following
configuration in <code>~/.Xresources</code>:</p>
<div class="codehilite"><pre><span class="n">Xft</span><span class="p">.</span><span class="n">antialias</span><span class="p">:</span> <span class="n">true</span>
<span class="n">Xft</span><span class="p">.</span><span class="n">hinting</span><span class="p">:</span> <span class="n">true</span>
<span class="n">Xft</span><span class="p">.</span><span class="n">hintstyle</span><span class="p">:</span> <span class="n">hintlight</span>
<span class="n">Xft</span><span class="p">.</span><span class="n">rgba</span><span class="p">:</span> <span class="n">rgb</span>
<span class="n">URxvt</span><span class="p">.</span><span class="n">font</span><span class="p">:</span> <span class="n">xft</span><span class="p">:</span><span class="n">DejaVu</span> <span class="n">Sans</span> <span class="n">Mono</span><span class="o">-</span>8
<span class="n">URxvt</span><span class="p">.</span><span class="n">letterSpace</span><span class="p">:</span> <span class="o">-</span>1
</pre></div>


<p>The result is less crisp but seems a bit more readable. I may switch
back in the&nbsp;future.</p>
<p><img alt="Comparison of terminal fonts" src="//d1g3mdmxf8zbo9.cloudfront.net/images/awesome/terminal-fonts.png" title="Comparison of the fixed font (left) with DejaVu Sans Mono (right)"></p>
<h1 id="next-steps">Next steps</h1>
<p>My reliance on the mouse has been greatly reduced. However, I still
need it for casual browsing. I am looking at <a href="http://mason-larobina.github.com/luakit/" title="luakit: Fast, small, webkit based browser framework extensible by Lua">luakit</a>, a
<em>WebKit</em>-based browser extensible with <em>Lua</em>, for this&nbsp;purpose.</p>
<div class="footnote">
<hr>
<ol>
<li id="fn:own">
<p>The console gets its own unique name. This allows <em>awesome</em> to
    reliably detect when it is spawned, even on restart. It is how
    the Quake console works in the <em>mod</em> of <abbr title="F? Virtual Window Manager"><span class="caps">FVWM</span></abbr> I was using.&#160;<a href="#fnref:own" rev="footnote" title="Jump back to footnote 1 in the text">&#8617;</a></p>
</li>
<li id="fn:perwindow">
<p>However, the layout is global, not per-window. If you
          are interested in a per-window layout, take a look at
          <a href="https://github.com/qnikst/kbdd/" title="kbdd: keyboard library for per-window keyboard layout">kbdd</a>.&#160;<a href="#fnref:perwindow" rev="footnote" title="Jump back to footnote 2 in the text">&#8617;</a></p>
</li>
<li id="fn:consolekit">
<p>Nowadays, you cannot really survive without
           <em>ConsoleKit</em>. Many <em>PolicyKit</em> policies do not rely on
           groups any more to grant access to your devices.&#160;<a href="#fnref:consolekit" rev="footnote" title="Jump back to footnote 3 in the text">&#8617;</a></p>
</li>
</ol>
</div>
]]>
            </content>
        </entry>
                <entry>
            <title type="html">GPG Key Transition Statement 2012</title>
            <author><name>Vincent Bernat</name></author>
            <link href="http://vincent.bernat.im/en/blog/2012-gpg-transition-new-key.html"/>
            <updated>2012-06-16T19:02:11+02:00</updated>
            <id>http://www.luffy.cx/en/blog/2012-gpg-transition-new-key.html</id>

            <content type="html">
<![CDATA[
<p>I am transitioning my <abbr title="GNU Privacy Guard"><span class="caps">GPG</span></abbr> key from an old 1024-bit <abbr title="Digital Signature Algorithm"><span class="caps">DSA</span></abbr> key to a new
4096-bit <abbr title="Rivest Shamir Adleman"><span class="caps">RSA</span></abbr> key.  The old key will continue to be valid for some time
but I prefer all new correspondence to be encrypted with the new
key. I will be making all signatures going forward with the new&nbsp;key.</p>
<p>I have followed the <a href="http://www.debian-administration.org/users/dkg/weblog/48" title="HOWTO prep for migration off of SHA-1 in OpenPGP">excellent tutorial from Daniel Kahn Gillmor</a>
which also explains why this migration is needed. The only step that I
did not execute is issuing a new certification for keys I have signed
in the past. I did not find any search engine to tell me which keys I
have&nbsp;signed.</p>
<p>Here is the signed transition statement (I have stolen it <a href="http://upsilon.cc/~zack/key-transition.2010.txt" title="Stefano Zacchiroli key transition statement">from Zack</a>):</p>
<div class="codehilite"><pre><span class="err">-----<span class="caps">BEGIN</span> <span class="caps">PGP</span> <span class="caps">SIGNED</span> <span class="caps">MESSAGE</span>-----</span>
<span class="err">Hash: <span class="caps">SHA256</span>,<span class="caps">SHA1</span></span>

<span class="err">I am transitioning <span class="caps">GPG</span> keys from an old 1024-bit <span class="caps">DSA</span> key to a new</span>
<span class="err">4096-bit <span class="caps">RSA</span> key.  The old key will continue to be valid for some</span>
<span class="err">time, but I prefer all new correspondance to be encrypted in the new</span>
<span class="err">key, and will be making all signatures going forward with the new key.</span>

<span class="err">This transition document is signed with both keys to validate the</span>
<span class="err">transition.</span>

<span class="err">If you have signed my old key, I would appreciate signatures on my new</span>
<span class="err">key as well, provided that your signing policy permits that without</span>
<span class="err">reauthenticating me.</span>

<span class="err">The old key, which I am transitional away from, is:</span>

<span class="err">  pub   1024D/<span class="caps">F22A794E</span> 2001-03-23</span>
<span class="err">      Key fingerprint = 5854 <span class="caps">AF2B</span> 65B2 0E96 2161  <span class="caps">E32B</span> 285B <span class="caps">D7A1</span> <span class="caps">F22A</span> 794E</span>

<span class="err">The new key, to which I am transitioning, is:</span>

<span class="err">  pub   4096R/353525F9 2012-06-16 [expires: 2014-06-16]</span>
<span class="err">      Key fingerprint = <span class="caps">AEF2</span> 3487 66F3 71C6 89A7  3600 95A4 <span class="caps">2FE8</span> 3535 25F9</span>

<span class="err">To fetch the full new key from a public key server using GnuPG, run:</span>

<span class="err">  gpg --keyserver keys.gnupg.net --recv-key <span class="caps">95A42FE8353525F9</span></span>

<span class="err">If you have already validated my old key, you can then validate that</span>
<span class="err">the new key is signed by my old key:</span>

<span class="err">  gpg --check-sigs <span class="caps">95A42FE8353525F9</span></span>

<span class="err">If you then want to sign my new key, a simple and safe way to do that</span>
<span class="err">is by using caff (shipped in Debian as part of the &quot;signing-party&quot;</span>
<span class="err">package) as follows:</span>

<span class="err">  caff <span class="caps">95A42FE8353525F9</span></span>

<span class="err">Please contact me via e-mail at &lt;vincent@bernat.im&gt; if you have any</span>
<span class="err">questions about this document or this transition.</span>

<span class="err">  Vincent Bernat</span>
<span class="err">  vincent@bernat.im</span>
<span class="err">  16-06-2012</span>
<span class="err">-----<span class="caps">BEGIN</span> <span class="caps">PGP</span> <span class="caps">SIGNATURE</span>-----</span>
<span class="err">Version: GnuPG v1.4.12 (<span class="caps">GNU</span>/Linux)</span>

<span class="err">iQIcBAEBCAAGBQJP3LchAAoJEJWkL+g1NSX5fV0P/iEjcLp7EOky/AVkbsHxiV30</span>
<span class="err">KId7aYmcZRLJpvLZPz0xxThZq2MTVhX+SdiPcrSTa8avY8Kay6gWjEK0FtB+72du</span>
<span class="err">3RxhVYDqEQtrhUmIY2jOVyw9c0vMJh4189J+8iJ5HGQo9SjFEuRrP9xxNTv3OQD5</span>
<span class="err">fRTMUBMC3q1/KcuhPA8ULp4L1OS0xTksRfvs6852XDfSJIZhsYxYODWpWqLsGEcu</span>
<span class="err">DhQ7KHtbOUwjwsoiURGnjwdiFpbb6/9cwXeD3/GAY9uNHxac6Ufi4J64bealuPXi</span>
<span class="err">O4GgG9cEreBTkPrUsyrHtCYzg43X0q4B7TSDg27j0xm+xd+jW/d/0AlBHPXcXemc</span>
<span class="err">b+pw09qLOwQWbsd6d4bx22VXI75btSFs8HwR9hKHBeOAagMHz+AVl5pLXo2rYoiH</span>
<span class="err">34fR1HWqyRdT3bCt19Ys1N+d0fznsZNFOMC+l23QyptOoMz7t7vZ6GbB20ExafrW</span>
<span class="err">+gi7r1sV/6tb9sYMcVV2S3XT003Uwg8PXajyOnFHxPsMoX9zsk1ejo3lxkkTZs0H</span>
<span class="err">yLZtUj3iZ3yX9e2yfv3eOxitR4+bIntEbMecnTI9xJn+33QTz/pWBqg9uDosqzUo</span>
<span class="err">UoQtc6WVn9x3Zsi7aneDYcp06ZdphgsyWhgiLIhQG9MAK9wKthKiZv8DqGYDOsKt</span>
<span class="err">WwpQFvns33e5x4SM4KxXiEYEARECAAYFAk/ctyEACgkQKFvXofIqeU5YLwCdFhEL</span>
<span class="err">P7vpUJA2zv9+dpPN5GLfBlcAn0mDGJcjJpYZl/+aXEnP/8cE0day</span>
<span class="err">=0QnC</span>
<span class="err">-----<span class="caps">END</span> <span class="caps">PGP</span> <span class="caps">SIGNATURE</span>-----</span>
</pre></div>


<p>For easier access, I have also published it in text format. You can
check it&nbsp;with:</p>
<div class="codehilite"><pre><span class="gp">$</span> gpg --keyserver keys.gnupg.net --recv-key <span class="caps">95A42FE8353525F9</span>
<span class="go">gpg: requesting key 353525F9 from hkp server keys.gnupg.net</span>
<span class="go">gpg: key 353525F9: &quot;Vincent Bernat &lt;bernat@luffy.cx&gt;&quot; not changed</span>
<span class="go">gpg: Total number processed: 1</span>
<span class="go">gpg:              unchanged: 1</span>
<span class="gp">$</span> curl http://vincent.bernat.im/media/files/key-transition-2012.txt | <span class="se">\</span>
<span class="gp">&gt;</span>       gpg --verify
</pre></div>


<p>To avoid signing/encrypting with the old key, which shares the same email
addresses as the new one, I have saved it, removed it from the
keyring and added it again. The new key is now first in both the
secret and the public keyrings and will be used whenever the
appropriate email address is&nbsp;requested.</p>
<div class="codehilite"><pre><span class="gp">$</span> gpg --export-secret-keys <span class="caps">F22A794E</span> &gt; ~/tmp/secret
<span class="gp">$</span> gpg --export <span class="caps">F22A794E</span> &gt; ~/tmp/public
<span class="gp">$</span> gpg --delete-secret-key <span class="caps">F22A794E</span>
<span class="go">sec  1024D/<span class="caps">F22A794E</span> 2001-03-23 Vincent Bernat &lt;bernat@luffy.cx&gt;</span>

<span class="go">Delete this key from the keyring? (y/N) y</span>
<span class="go">This is a secret key! - really delete? (y/N) y</span>
<span class="gp">$</span> gpg --delete-key <span class="caps">F22A794E</span>
<span class="go">pub  1024D/<span class="caps">F22A794E</span> 2001-03-23 Vincent Bernat &lt;bernat@luffy.cx&gt;</span>

<span class="go">Delete this key from the keyring? (y/N) y</span>
<span class="gp">$</span> gpg --import ~/tmp/public
<span class="go">gpg: key <span class="caps">F22A794E</span>: public key &quot;Vincent Bernat &lt;bernat@luffy.cx&gt;&quot; imported</span>
<span class="go">gpg: Total number processed: 1</span>
<span class="go">gpg:               imported: 1</span>
<span class="go">gpg: 3 marginal(s) needed, 1 complete(s) needed, classic trust model</span>
<span class="go">gpg: depth: 0  valid:   2  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 2u</span>
<span class="go">gpg: next trustdb check due at 2014-06-16</span>
<span class="gp">$</span> gpg --import ~/tmp/secret
<span class="go">gpg: key <span class="caps">F22A794E</span>: secret key imported</span>
<span class="go">gpg: key <span class="caps">F22A794E</span>: &quot;Vincent Bernat &lt;bernat@luffy.cx&gt;&quot; not changed</span>
<span class="go">gpg: Total number processed: 1</span>
<span class="go">gpg:              unchanged: 1</span>
<span class="go">gpg:       secret keys read: 1</span>
<span class="go">gpg:   secret keys imported: 1</span>
<span class="gp">$</span> rm ~/tmp/public ~/tmp/secret
<span class="gp">$</span> gpg --edit-key <span class="caps">F22A794E</span>
<span class="go">[...]</span>
<span class="go">gpg&gt; trust</span>
<span class="go">[...]</span>
<span class="go">Please decide how far you trust this user to correctly verify other users&#39; keys</span>
<span class="go">(by looking at passports, checking fingerprints from different sources, etc.)</span>

<span class="go">  1 = I don&#39;t know or won&#39;t say</span>
<span class="go">  2 = I do <span class="caps">NOT</span> trust</span>
<span class="go">  3 = I trust marginally</span>
<span class="go">  4 = I trust fully</span>
<span class="go">  5 = I trust ultimately</span>
<span class="go">  m = back to the main menu</span>

<span class="go">Your decision? 5</span>
<span class="go">Do you really want to set this key to ultimate trust? (y/N) y</span>
</pre></div>


<p>I now need to gather some signatures for the new key. If this is
appropriate for you, please sign the new key if you signed the old&nbsp;one.</p>
]]>
            </content>
        </entry>
                <entry>
            <title type="html">Integration of NetSNMP into an event loop</title>
            <author><name>Vincent Bernat</name></author>
            <link href="http://vincent.bernat.im/en/blog/2012-snmp-event-loop.html"/>
            <updated>2012-05-28T20:54:50+02:00</updated>
            <id>http://www.luffy.cx/en/blog/2012-snmp-event-loop.html</id>

            <content type="html">
<![CDATA[
<p><em>NetSNMP</em> comes with its own <a href="http://en.wikipedia.org/wiki/Event_loop" title="Event loop on Wikipedia">event loop</a> based on the <code>select()</code>
system call. While you can build your program around it, you may
prefer to use an existing event loop, either a custom one or something
like <a href="http://libevent.org/" title="libevent – an event notification library">libevent</a>, <a href="http://software.schmorp.de/pkg/libev.html" title="libev – a full-featured and high-performance event loop">libev</a> or <a href="http://twistedmatrix.com" title="an event-driven networking engine written in Python">Twisted</a>.</p>
<div class="toc">
<ul>
<li><a href="#own-custom-event-loop">Own custom event loop</a></li>
<li><a href="#third-party-event-loop">Third-party event loop</a><ul>
<li><a href="#libevent">libevent</a></li>
<li><a href="#twisted-reactor">Twisted reactor</a></li>
</ul>
</li>
<li><a href="#miscellaneous">Miscellaneous</a><ul>
<li><a href="#lack-of-asynchronicity">Lack of asynchronicity</a></li>
<li><a href="#limitation-of-file-descriptor-sets">Limitation of file descriptor sets</a></li>
<li><a href="#threads">Threads</a></li>
</ul>
</li>
</ul>
</div>
<h1 id="own-custom-event-loop">Own custom event loop</h1>
<p>Let&#8217;s start with the easiest case: you have written your own event
loop. This means you make a call to <code>select()</code>, <code>poll()</code>, <code>epoll()</code>,
<code>kqueue()</code> or something similar. All those functions take a set of
file descriptors and wait for any of them to become available in a
specified time&nbsp;frame.</p>
<p>Here is a typical use of <code>select()</code> (adapted from <a href="http://www.nongnu.org/quagga/" title="Quagga Routing Software Suite">Quagga</a>):</p>
<div class="codehilite"><pre><span class="n">readfd</span> <span class="o">=</span> <span class="n">m</span><span class="o">-&gt;</span><span class="n">readfd</span><span class="p">;</span>
<span class="n">writefd</span> <span class="o">=</span> <span class="n">m</span><span class="o">-&gt;</span><span class="n">writefd</span><span class="p">;</span>
<span class="n">exceptfd</span> <span class="o">=</span> <span class="n">m</span><span class="o">-&gt;</span><span class="n">exceptfd</span><span class="p">;</span>
<span class="n">timer_wait</span> <span class="o">=</span> <span class="n">thread_timer_wait</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">m</span><span class="o">-&gt;</span><span class="n">timer</span><span class="p">);</span>
<span class="n">num</span> <span class="o">=</span> <span class="n">select</span> <span class="p">(</span><span class="n">FD_SETSIZE</span><span class="p">,</span>
              <span class="o">&amp;</span><span class="n">readfd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">writefd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">exceptfd</span><span class="p">,</span>
              <span class="n">timer_wait</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">num</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">errno</span> <span class="o">==</span> <span class="n"><span class="caps">EINTR</span></span><span class="p">)</span>
      <span class="k">continue</span><span class="p">;</span> <span class="cm">/* signal received - process it */</span>
    <span class="n">zlog_warn</span> <span class="p">(</span><span class="s">&quot;select() error: %s&quot;</span><span class="p">,</span>
               <span class="n">safe_strerror</span> <span class="p">(</span><span class="n">errno</span><span class="p">));</span>
    <span class="k">return</span> <span class="nb"><span class="caps">NULL</span></span><span class="p">;</span>
  <span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">num</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="cm">/* Timeout handling */</span>
    <span class="n">thread_timer_process</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">m</span><span class="o">-&gt;</span><span class="n">timer</span><span class="p">);</span>
  <span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">num</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="n">thread_process_fd</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">readfd</span><span class="p">);</span>
    <span class="n">thread_process_fd</span> <span class="p">(</span><span class="o">&amp;</span><span class="n">writefd</span><span class="p">);</span>
  <span class="p">}</span>
</pre></div>


<p>The <code>thread_process_fd (fds)</code> function iterates over each file descriptor
<code>fd</code> and executes the appropriate action if <code>FD_ISSET (fd, fds)</code> is&nbsp;true.</p>
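<p>As a rough illustration, such a function could look like the sketch
below. <code>process_ready_fds()</code> and <code>handle_fd()</code> are hypothetical names, not
Quagga&#8217;s actual implementation; only <code>FD_ISSET</code> and <code>FD_SETSIZE</code> come from
the standard&nbsp;headers.</p>
<div class="codehilite"><pre>#include &lt;stdio.h&gt;
#include &lt;sys/select.h&gt;

/* Hypothetical per-descriptor action; a real loop would look up
   and run the callback registered for this descriptor. */
static void
handle_fd (int fd)
{
  printf ("fd %d is ready\n", fd);
}

/* Walk the descriptors below maxfd and act on those that select()
   left set. Returns the number of descriptors handled. */
static int
process_ready_fds (fd_set *fds, int maxfd)
{
  int fd, handled = 0;
  for (fd = 0; fd &lt; maxfd &amp;&amp; fd &lt; FD_SETSIZE; fd++)
    if (FD_ISSET (fd, fds))
      {
        handle_fd (fd);
        handled++;
      }
  return handled;
}
</pre></div>


<p>Once <code>select()</code> has returned a positive count, the loop would call
something like <code>process_ready_fds (&amp;readfd, FD_SETSIZE)</code> for each set it
cares&nbsp;about.</p>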
<p>Integrating <em>NetSNMP</em> into such an event loop is easy: <em>NetSNMP</em>
provides the <code>snmp_select_info()</code> function which alters a set of file
descriptors to insert its own.  Here is the new&nbsp;code:</p>
<div class="codehilite"><pre><span class="cp">#if defined HAVE_SNMP</span>
<span class="k">struct</span> <span class="n">timeval</span> <span class="n">snmp_timer_wait</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">snmpblock</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">fdsetsize</span><span class="p">;</span>
<span class="cp">#endif</span>

<span class="cm">/* ... */</span>

<span class="cp">#if defined HAVE_SNMP</span>
<span class="n">fdsetsize</span> <span class="o">=</span> <span class="n">FD_SETSIZE</span><span class="p">;</span>
<span class="n">snmpblock</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">timer_wait</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="n">snmpblock</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="n">memcpy</span><span class="p">(</span><span class="o">&amp;</span><span class="n">snmp_timer_wait</span><span class="p">,</span> <span class="n">timer_wait</span><span class="p">,</span>
           <span class="k">sizeof</span><span class="p">(</span><span class="k">struct</span> <span class="n">timeval</span><span class="p">));</span>
  <span class="p">}</span>
<span class="n">snmp_select_info</span><span class="p">(</span><span class="o">&amp;</span><span class="n">fdsetsize</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">readfd</span><span class="p">,</span>
                 <span class="o">&amp;</span><span class="n">snmp_timer_wait</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">snmpblock</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">snmpblock</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
  <span class="n">timer_wait</span> <span class="o">=</span> <span class="o">&amp;</span><span class="n">snmp_timer_wait</span><span class="p">;</span>
<span class="cp">#endif</span>

<span class="n">num</span> <span class="o">=</span> <span class="n">select</span> <span class="p">(</span><span class="n">FD_SETSIZE</span><span class="p">,</span>
              <span class="o">&amp;</span><span class="n">readfd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">writefd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">exceptfd</span><span class="p">,</span>
              <span class="n">timer_wait</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">num</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>

<span class="cp">#if defined HAVE_SNMP</span>
<span class="k">if</span> <span class="p">(</span><span class="n">num</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">)</span>
  <span class="n">snmp_read</span><span class="p">(</span><span class="o">&amp;</span><span class="n">readfd</span><span class="p">);</span>
<span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">num</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
  <span class="p">{</span>
    <span class="n">snmp_timeout</span><span class="p">();</span>
    <span class="n">run_alarms</span><span class="p">();</span>
  <span class="p">}</span>
<span class="n">netsnmp_check_outstanding_agent_requests</span><span class="p">();</span>
<span class="cp">#endif</span>

<span class="cm">/* ... */</span>
</pre></div>


<p><code>snmp_select_info()</code> may modify the provided set of file descriptors
(and therefore, its size). It may also modify the provided timer in
case it needs to schedule an action before the original&nbsp;timeout.</p>
<p><code>snmpblock</code> is a tricky variable. While <code>select()</code> can be called with
a timeout set to <code>NULL</code>, this is not the case for <code>snmp_select_info()</code>:
you have to pass a valid pointer. <code>snmpblock</code> is set to 0 if the
provided timeout must be considered or to 1
otherwise. <code>snmp_select_info()</code> will set <code>snmpblock</code> to 0 <em>if and only
if</em> it alters the&nbsp;timeout.</p>
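<p>The contract around <code>snmpblock</code> can be isolated in two small helpers.
This is only a sketch of the logic described above:
<code>prepare_snmp_timeout()</code> and <code>timeout_for_select()</code> are hypothetical
names that belong neither to <em>NetSNMP</em> nor to Quagga, and the call to
<code>snmp_select_info()</code> itself is left&nbsp;out.</p>
<div class="codehilite"><pre>#include &lt;stddef.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/time.h&gt;

/* Hypothetical helper: build the timeout handed to snmp_select_info()
   from a select()-style timeout that may be NULL. Returns the initial
   value for `block`: 1 when there is no timeout to consider, else 0. */
static int
prepare_snmp_timeout (const struct timeval *loop_timeout,
                      struct timeval *snmp_timeout)
{
  if (loop_timeout == NULL)
    {
      /* A valid pointer is still required, so zero the storage. */
      snmp_timeout-&gt;tv_sec = 0;
      snmp_timeout-&gt;tv_usec = 0;
      return 1;
    }
  memcpy (snmp_timeout, loop_timeout, sizeof (struct timeval));
  return 0;
}

/* Hypothetical helper: pick the timeout for select() afterwards.
   block == 0 means *snmp_timeout holds a meaningful deadline. */
static struct timeval *
timeout_for_select (struct timeval *loop_timeout,
                    struct timeval *snmp_timeout, int block)
{
  return block == 0 ? snmp_timeout : loop_timeout;
}
</pre></div>


<p>When the loop has no timer of its own and <em>NetSNMP</em> leaves <code>block</code> at 1,
<code>select()</code> still receives <code>NULL</code> and blocks&nbsp;indefinitely.</p>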
<p>Here are two examples of integration. Both are for a subagent but
the code for a manager is the&nbsp;same:</p>
<ul>
<li><a href="https://github.com/vincentbernat/quagga/blob/201abe248e0736e74e7333fd33a954b7cac91a9a/lib/thread.c#L1039-L1166">Quagga</a></li>
<li><a href="https://github.com/vincentbernat/keepalived/blob/bfb75bc7291646040f6cc5ec677c6e4292c2b0e4/lib/scheduler.c#L501-L701">Keepalived</a></li>
</ul>
<h1 id="third-party-event-loop">Third-party event loop</h1>
<p>With a third-party event loop, you don&#8217;t have access to the <code>select()</code>
system call anymore. Therefore, you cannot alter it with
<code>snmp_select_info()</code>. Instead, at each iteration of the loop, a list
of <span class="caps">SNMP</span>-related file descriptors is kept up-to-date with the help of
<code>snmp_select_info()</code>: new ones are added and old ones are&nbsp;removed.</p>
<h2 id="libevent">libevent</h2>
<p>Let&#8217;s assume that <code>snmp_fds</code> is the list of current <span class="caps">SNMP</span>-related
events<sup id="fnref:base"><a href="#fn:base" rel="footnote">1</a></sup>. A function <code>levent_snmp_update()</code> calls
<code>snmp_select_info()</code> to update this list. Here is a partial
implementation (error handling removed and some declarations omitted,
look at the
<a href="https://github.com/vincentbernat/lldpd/blob/5fd6695c090ddecb77e8324b6d6fb8b8fe43860a/src/event.c#L51-L188">complete version</a>
from <a href="http://vincentbernat.github.com/lldpd" title="lldpd — a 802.1ab implementation">lldpd</a>):</p>
<div class="codehilite"><pre><span class="k">static</span> <span class="kt">void</span>
<span class="nf">levent_snmp_update</span><span class="p">()</span>
<span class="p">{</span>
    <span class="kt">int</span> <span class="n">maxfd</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">block</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>

    <span class="n">FD_ZERO</span><span class="p">(</span><span class="o">&amp;</span><span class="n">fdset</span><span class="p">);</span>
    <span class="n">snmp_select_info</span><span class="p">(</span><span class="o">&amp;</span><span class="n">maxfd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fdset</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">timeout</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">block</span><span class="p">);</span>

    <span class="cm">/* We need to untrack any event whose <span class="caps">FD</span> is not in `fdset`</span>
<span class="cm">       anymore */</span>
    <span class="k">for</span> <span class="p">(</span><span class="n">snmpfd</span> <span class="o">=</span> <span class="n">TAILQ_FIRST</span><span class="p">(</span><span class="n">snmp_fds</span><span class="p">);</span>
         <span class="n">snmpfd</span><span class="p">;</span>
         <span class="n">snmpfd</span> <span class="o">=</span> <span class="n">snmpfd_next</span><span class="p">)</span> <span class="p">{</span>
        <span class="n">snmpfd_next</span> <span class="o">=</span> <span class="n">TAILQ_NEXT</span><span class="p">(</span><span class="n">snmpfd</span><span class="p">,</span> <span class="n">next</span><span class="p">);</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">event_get_fd</span><span class="p">(</span><span class="n">snmpfd</span><span class="o">-&gt;</span><span class="n">ev</span><span class="p">)</span> <span class="o">&gt;=</span> <span class="n">maxfd</span> <span class="o">||</span>
            <span class="p">(</span><span class="o">!</span><span class="n">FD_ISSET</span><span class="p">(</span><span class="n">event_get_fd</span><span class="p">(</span><span class="n">snmpfd</span><span class="o">-&gt;</span><span class="n">ev</span><span class="p">),</span> <span class="o">&amp;</span><span class="n">fdset</span><span class="p">)))</span> <span class="p">{</span>
            <span class="n">event_free</span><span class="p">(</span><span class="n">snmpfd</span><span class="o">-&gt;</span><span class="n">ev</span><span class="p">);</span>
            <span class="n">TAILQ_REMOVE</span><span class="p">(</span><span class="n">snmp_fds</span><span class="p">,</span> <span class="n">snmpfd</span><span class="p">,</span> <span class="n">next</span><span class="p">);</span>
            <span class="n">free</span><span class="p">(</span><span class="n">snmpfd</span><span class="p">);</span>
        <span class="p">}</span> <span class="k">else</span>
            <span class="n">FD_CLR</span><span class="p">(</span><span class="n">event_get_fd</span><span class="p">(</span><span class="n">snmpfd</span><span class="o">-&gt;</span><span class="n">ev</span><span class="p">),</span> <span class="o">&amp;</span><span class="n">fdset</span><span class="p">);</span>
    <span class="p">}</span>

    <span class="cm">/* Invariant: <span class="caps">FD</span> in `fdset` are not in list of <span class="caps">FD</span> */</span>
    <span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">fd</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">fd</span> <span class="o">&lt;</span> <span class="n">maxfd</span><span class="p">;</span> <span class="n">fd</span><span class="o">++</span><span class="p">)</span>
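        <span class="k">if</span> <span class="p">(</span><span class="n">FD_ISSET</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fdset</span><span class="p">))</span> <span class="p">{</span>
            <span class="cm">/* Allocate a new list entry for this descriptor */</span>
            <span class="n">snmpfd</span> <span class="o">=</span> <span class="n">calloc</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="o">*</span><span class="n">snmpfd</span><span class="p">));</span>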
            <span class="n">snmpfd</span><span class="o">-&gt;</span><span class="n">ev</span> <span class="o">=</span> <span class="n">event_new</span><span class="p">(</span><span class="n">base</span><span class="p">,</span> <span class="n">fd</span><span class="p">,</span>
                <span class="n">EV_READ</span> <span class="o">|</span> <span class="n">EV_PERSIST</span><span class="p">,</span>
                <span class="n">levent_snmp_read</span><span class="p">,</span>
                <span class="nb"><span class="caps">NULL</span></span><span class="p">);</span>
            <span class="n">event_add</span><span class="p">(</span><span class="n">snmpfd</span><span class="o">-&gt;</span><span class="n">ev</span><span class="p">,</span> <span class="nb"><span class="caps">NULL</span></span><span class="p">);</span>
            <span class="n">TAILQ_INSERT_TAIL</span><span class="p">(</span><span class="n">snmp_fds</span><span class="p">,</span> <span class="n">snmpfd</span><span class="p">,</span> <span class="n">next</span><span class="p">);</span>
        <span class="p">}</span>

    <span class="cm">/* If needed, handle timeout */</span>
    <span class="n">evtimer_add</span><span class="p">(</span><span class="n">snmp_timeout</span><span class="p">,</span> <span class="n">block</span><span class="o">?</span><span class="nb"><span class="caps">NULL</span></span><span class="o">:&amp;</span><span class="n">timeout</span><span class="p">);</span>
<span class="p">}</span>
</pre></div>


<p>Then, replace the main loop (usually, a call to
<code>event_base_dispatch()</code>) with&nbsp;this:</p>
<div class="codehilite"><pre><span class="k">do</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="n">event_base_got_break</span><span class="p">(</span><span class="n">base</span><span class="p">)</span> <span class="o">||</span>
        <span class="n">event_base_got_exit</span><span class="p">(</span><span class="n">base</span><span class="p">))</span>
        <span class="k">break</span><span class="p">;</span>
    <span class="n">netsnmp_check_outstanding_agent_requests</span><span class="p">();</span>
    <span class="n">levent_snmp_update</span><span class="p">();</span>
<span class="p">}</span> <span class="k">while</span> <span class="p">(</span><span class="n">event_base_loop</span><span class="p">(</span><span class="n">base</span><span class="p">,</span> <span class="n">EVLOOP_ONCE</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
</pre></div>


<p>Here is how the two callbacks are&nbsp;defined:</p>
<div class="codehilite"><pre><span class="k">static</span> <span class="kt">void</span>
<span class="nf">levent_snmp_read</span><span class="p">(</span><span class="n">evutil_socket_t</span> <span class="n">fd</span><span class="p">,</span> <span class="kt">short</span> <span class="n">what</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
    <span class="n">FD_ZERO</span><span class="p">(</span><span class="o">&amp;</span><span class="n">fdset</span><span class="p">);</span>
    <span class="n">FD_SET</span><span class="p">(</span><span class="n">fd</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">fdset</span><span class="p">);</span>
    <span class="n">snmp_read</span><span class="p">(</span><span class="o">&amp;</span><span class="n">fdset</span><span class="p">);</span>
    <span class="n">levent_snmp_update</span><span class="p">();</span>
<span class="p">}</span>

<span class="k">static</span> <span class="kt">void</span>
<span class="nf">levent_snmp_timeout</span><span class="p">(</span><span class="n">evutil_socket_t</span> <span class="n">fd</span><span class="p">,</span> <span class="kt">short</span> <span class="n">what</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
    <span class="n">snmp_timeout</span><span class="p">();</span>
    <span class="n">run_alarms</span><span class="p">();</span>
    <span class="n">levent_snmp_update</span><span class="p">();</span>
<span class="p">}</span>
</pre></div>


<h2 id="twisted-reactor">Twisted reactor</h2>
<p>As a second example, here is how to integrate <em>NetSNMP</em> into the
<a href="http://twistedmatrix.com" title="an event-driven networking engine written in Python">Twisted</a> reactor. <em>Twisted</em> is an event-driven network programming
framework written in Python. It does not come with support for
<span class="caps">SNMP</span><sup id="fnref:twistedsnmp"><a href="#fn:twistedsnmp" rel="footnote">2</a></sup>. Because of the language mismatch, the integration
of <em>NetSNMP</em> in <em>Twisted</em> needs to be done using a C extension or the
<code>ctypes</code> module.</p>
<p>The <em>Twisted</em> event loop is called a reactor. As with <em>libevent</em>, events
need to be registered. It is possible to register file descriptor-like
objects using a class implementing a handful of methods. Here is the
implementation of such a class (adapted from <em>PyNetSNMP</em>, a subproject
of <a href="http://www.zenoss.com/" title="Open-source application, server, and network management platform">Zenoss</a> using the <code>ctypes</code> module):</p>
<div class="codehilite"><pre><span class="k">class</span> <span class="nc">SnmpReader</span><span class="p">:</span>
    <span class="s">&quot;Respond to input events&quot;</span>
    <span class="n">implements</span><span class="p">(</span><span class="n">IReadDescriptor</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">fd</span><span class="p">):</span>
        <span class="bp">self</span><span class="o">.</span><span class="n">fd</span> <span class="o">=</span> <span class="n">fd</span>

    <span class="k">def</span> <span class="nf">doRead</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="n">netsnmp</span><span class="o">.</span><span class="n">snmp_read</span><span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">fd</span><span class="p">)</span>

    <span class="k">def</span> <span class="nf">fileno</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
        <span class="k">return</span> <span class="bp">self</span><span class="o">.</span><span class="n">fd</span>
</pre></div>


<p>As with <em>libevent</em>, we have a function to update the list of <span class="caps">SNMP</span>
related events (stored as a mapping between file descriptors and
<code>SnmpReader</code> instances):</p>
<div class="codehilite"><pre><span class="k">class</span> <span class="nc">Timer</span><span class="p">(</span><span class="nb">object</span><span class="p">):</span>
    <span class="n">callLater</span> <span class="o">=</span> <span class="bp">None</span>
<span class="n">timer</span> <span class="o">=</span> <span class="n">Timer</span><span class="p">()</span>
<span class="n">fdMap</span> <span class="o">=</span> <span class="p">{}</span>

<span class="k">def</span> <span class="nf">updateReactor</span><span class="p">():</span>
    <span class="s">&quot;Add/remove event handlers for <span class="caps">SNMP</span> file descriptors and timers&quot;</span>

    <span class="n">fds</span><span class="p">,</span> <span class="n">t</span> <span class="o">=</span> <span class="n">netsnmp</span><span class="o">.</span><span class="n">snmp_select_info</span><span class="p">()</span>
    <span class="k">for</span> <span class="n">fd</span> <span class="ow">in</span> <span class="n">fds</span><span class="p">:</span>
        <span class="k">if</span> <span class="n">fd</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">fdMap</span><span class="p">:</span>
            <span class="n">reader</span> <span class="o">=</span> <span class="n">SnmpReader</span><span class="p">(</span><span class="n">fd</span><span class="p">)</span>
            <span class="n">fdMap</span><span class="p">[</span><span class="n">fd</span><span class="p">]</span> <span class="o">=</span> <span class="n">reader</span>
            <span class="n">reactor</span><span class="o">.</span><span class="n">addReader</span><span class="p">(</span><span class="n">reader</span><span class="p">)</span>
    <span class="n">current</span> <span class="o">=</span> <span class="nb">set</span><span class="p">(</span><span class="n">fdMap</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span>
    <span class="n">need</span> <span class="o">=</span> <span class="nb">set</span><span class="p">(</span><span class="n">fds</span><span class="p">)</span>
    <span class="n">doomed</span> <span class="o">=</span> <span class="n">current</span> <span class="o">-</span> <span class="n">need</span>
    <span class="k">for</span> <span class="n">d</span> <span class="ow">in</span> <span class="n">doomed</span><span class="p">:</span>
        <span class="n">reactor</span><span class="o">.</span><span class="n">removeReader</span><span class="p">(</span><span class="n">fdMap</span><span class="p">[</span><span class="n">d</span><span class="p">])</span>
        <span class="k">del</span> <span class="n">fdMap</span><span class="p">[</span><span class="n">d</span><span class="p">]</span>
    <span class="k">if</span> <span class="n">timer</span><span class="o">.</span><span class="n">callLater</span><span class="p">:</span>
        <span class="n">timer</span><span class="o">.</span><span class="n">callLater</span><span class="o">.</span><span class="n">cancel</span><span class="p">()</span>
        <span class="n">timer</span><span class="o">.</span><span class="n">callLater</span> <span class="o">=</span> <span class="bp">None</span>
    <span class="k">if</span> <span class="n">t</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
        <span class="n">timer</span><span class="o">.</span><span class="n">callLater</span> <span class="o">=</span> <span class="n">reactor</span><span class="o">.</span><span class="n">callLater</span><span class="p">(</span><span class="n">t</span><span class="p">,</span> <span class="n">checkTimeouts</span><span class="p">)</span>
</pre></div>


<p>Contrary to <em>libevent</em>, we cannot alter the main loop to call
<code>updateReactor()</code> at each iteration. Therefore, it <em>must</em> be called
after each <span class="caps">SNMP</span> related&nbsp;function.</p>
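<p>One way to avoid forgetting this call is to wrap each function that touches <em>NetSNMP</em> with a decorator. This is only a sketch: <code>updates_reactor()</code> is a hypothetical helper, and the two <code>fake_*</code> functions stand in for <code>updateReactor()</code> and for a real <span class="caps">SNMP</span> request so that the example runs without <em>Twisted</em>:</p>

```python
import functools


def updates_reactor(update):
    """Build a decorator running `update()` once the wrapped call is over.

    `update` is called whether the wrapped function returns or raises.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                update()  # resynchronise readers and timer with NetSNMP
        return wrapper
    return decorator


if __name__ == "__main__":
    calls = []

    def fake_update():  # stand-in for updateReactor()
        calls.append("update")

    @updates_reactor(fake_update)
    def fake_get(oid):  # stand-in for a function sending an SNMP request
        calls.append("get " + oid)
        return oid

    fake_get("1.3.6.1.2.1.1.1.0")
    print(calls)
```

<p>In the real program, the decorator would be built once with <code>updates_reactor(updateReactor)</code> and applied to every function calling into <em>NetSNMP</em>, including <code>doRead()</code> and the timeout handler.</p>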
<p>For another example, have a look at
<a href="https://github.com/vincentbernat/QCss-3/blob/ceb6520305fc7307891173b7bc211f06bcc85ed1/qcss3/collector/snmp.c#L75-L174">my equivalent implementation as a C extension</a>.</p>
<h1 id="miscellaneous">Miscellaneous</h1>
<h2 id="lack-of-asynchronicity">Lack of asynchronicity</h2>
<p>Integrating <em>NetSNMP</em> into an event-based program is not risk-free. I
have written a fairly comprehensive article on the
<a href="/en/blog/2012-fixing-async-agentx.html" title="Asynchronicity &amp; Net-SNMP AgentX protocol">lack of asynchronicity in <em>NetSNMP</em> AgentX protocol implementation</a>. I
urge you to read it if you want to integrate an <span class="caps">SNMP</span> subagent into an
existing&nbsp;program.</p>
<p>On the <em>manager side</em>, you will get similar drawbacks when using
SNMPv3. To retrieve or manipulate management information using SNMPv3,
it is necessary to know the identifier of the remote <span class="caps">SNMP</span> protocol
engine. This identifier can be configured directly on each manager but
it is usually discovered by querying
<code>SNMP-FRAMEWORK-MIB::snmpEngineID</code>, as described in <a href="http://tools.ietf.org/html/rfc5343" title="RFC 5343: SNMP Context EngineID Discovery"><span class="caps">RFC</span> 5343</a>.</p>
<p>Unfortunately, this discovery is done <strong>synchronously</strong> by
<em>NetSNMP</em>. There is a <a href="https://sourceforge.net/tracker/?func=detail&amp;aid=3446148&amp;group_id=12694&amp;atid=112694" title="snmp_async_send() blocks during SNMPv3 probe">bug report</a> for this issue but it
seems difficult to fix. I have not been able to come up with a proper
patch but I have described a <a href="http://www.mail-archive.com/net-snmp-coders@lists.sourceforge.net/msg18746.html" title="Workaround for snmp_async_send() and SNMPv3">workaround</a> involving
<code>snmp_sess_async_send()</code>.</p>
<p>There is no such problem with&nbsp;SNMPv2.</p>
<h2 id="limitation-of-file-descriptor-sets">Limitation of file descriptor sets</h2>
<p><code>snmp_select_info()</code> and <code>snmp_read()</code> use the <code>fd_set</code> type to
handle a set of file descriptors. This type should be manipulated with
<code>FD_CLR()</code>, <code>FD_ISSET()</code>, <code>FD_SET()</code> and <code>FD_ZERO()</code>. Those may be
defined as functions but they usually are macros. Moreover, no file
descriptor greater than or equal to <code>FD_SETSIZE</code> (which is usually set to 1024)
can be handled with the <code>fd_set</code> type. This means that if you have a
file descriptor greater than 1023, you won&#8217;t be able to use
<code>snmp_select_info()</code> with&nbsp;it.</p>
<p>Starting from <em>NetSNMP</em> 5.5, you can use <code>snmp_select_info2()</code> and
<code>snmp_read2()</code> instead of <code>snmp_select_info()</code> and <code>snmp_read()</code>. They
use the <code>netsnmp_large_fd_set</code> type, which is not limited to <code>FD_SETSIZE</code> descriptors.</p>
<p>If you want to keep compatibility with <em>NetSNMP</em> 5.4, here is another
twist. The <code>fd_set</code> type is usually a fixed-size array of long
integers. <code>FD_CLR()</code>, <code>FD_ISSET()</code> and <code>FD_SET()</code> are
size-independent. The size only matters for <code>FD_ZERO()</code> which is not
used by either <code>snmp_select_info()</code> or <code>snmp_read()</code>. You can
therefore allocate a larger <code>fd_set</code>. A common way is to compile your
program with <code>-D__FD_SETSIZE=4096</code>. You can still use <code>FD_ZERO()</code>
yourself. However, this is not portable (<span class="caps">GNU</span> C Library only). Another
way is to allocate your own array of long integers and cast it to
<code>fd_set</code>. You&#8217;ll have to redefine <code>FD_ZERO()</code>:</p>
<div class="codehilite"><pre><span class="k">typedef</span> <span class="k">struct</span> <span class="p">{</span>
   <span class="cm">/* Room for 4096 descriptors, one bit each */</span>
   <span class="kt">long</span> <span class="kt">int</span> <span class="n">fds_bits</span><span class="p">[</span><span class="mi">4096</span><span class="o">/</span><span class="p">(</span><span class="mi">8</span><span class="o">*</span><span class="k">sizeof</span><span class="p">(</span><span class="kt">long</span> <span class="kt">int</span><span class="p">))];</span>
<span class="p">}</span> <span class="n">my_fdset</span><span class="p">;</span>

<span class="cp">#undef FD_ZERO</span>
<span class="cp">#define FD_ZERO(fdset) memset((fdset), 0, sizeof(my_fdset))</span>

<span class="p">{</span>
   <span class="cm">/* ... */</span>
   <span class="n">my_fdset</span> <span class="n">fdset</span><span class="p">;</span>
   <span class="n">FD_ZERO</span><span class="p">(</span><span class="o">&amp;</span><span class="n">fdset</span><span class="p">);</span>
   <span class="cm">/* ... */</span>
   <span class="n">rc</span> <span class="o">=</span> <span class="n">snmp_select_info</span><span class="p">(</span><span class="o">&amp;</span><span class="n">n</span><span class="p">,</span> <span class="p">(</span><span class="n">fd_set</span> <span class="o">*</span><span class="p">)</span><span class="o">&amp;</span><span class="n">fdset</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">timeout</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">block</span><span class="p">);</span>
   <span class="cm">/* ... */</span>
<span class="p">}</span>
</pre></div>
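<p>To see why only <code>FD_ZERO()</code> cares about the size of the set, here is a small Python model of the usual layout: one bit per descriptor, packed into an array of words. It is a sketch under the assumption of 64-bit words, as with <code>long int</code> on 64-bit Linux. The function names are hypothetical and merely mirror the C macros:</p>

```python
BITS_PER_WORD = 64  # assumption: sizeof(long int) == 8


def words_needed(max_fds, bits_per_word=BITS_PER_WORD):
    """Number of words required to hold `max_fds` descriptors, rounded up."""
    return (max_fds + bits_per_word - 1) // bits_per_word


def fd_set(fd, words, bits_per_word=BITS_PER_WORD):
    """Model of FD_SET(): the word and the bit are derived from `fd` alone."""
    words[fd // bits_per_word] |= 1 << (fd % bits_per_word)


def fd_isset(fd, words, bits_per_word=BITS_PER_WORD):
    """Model of FD_ISSET()."""
    return bool(words[fd // bits_per_word] & (1 << (fd % bits_per_word)))


def fd_zero(words):
    """Model of FD_ZERO(): the only operation that walks the whole array."""
    for i in range(len(words)):
        words[i] = 0


if __name__ == "__main__":
    small = [0] * words_needed(1024)
    large = [0] * words_needed(4096)
    fd_set(2000, large)
    print(len(small), len(large), fd_isset(2000, large))
```

<p>With 64-bit words, 1024 descriptors need 16 words and 4096 descriptors need 64. Calling <code>fd_set(2000, small)</code> raises an <code>IndexError</code> in this model. In C, the same access on a standard <code>fd_set</code> writes past the end of the array, which is undefined behaviour.</p>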


<h2 id="threads">Threads</h2>
<p><em>NetSNMP</em> provides two&nbsp;<span class="caps">API</span>s:</p>
<ul>
<li>the traditional session <span class="caps">API</span> which is not thread-safe and</li>
<li>the single session&nbsp;<span class="caps">API</span>.</li>
</ul>
<p>In the above examples, I have used the traditional session <span class="caps">API</span>. If you
are using threads, you need to ensure that <em>all</em> <span class="caps">SNMP</span> operations are
done in a single thread. Otherwise, you need to adapt the examples to
use the single session <span class="caps">API</span>. <code>snmp_select_info()</code> is part of the
traditional session <span class="caps">API</span>. You should replace it with
<code>snmp_sess_select_info()</code> and keep a list of <span class="caps">SNMP</span>&nbsp;sessions.</p>
<p>The single session <span class="caps">API</span> is not available for the agent side of <em>NetSNMP</em>.</p>
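<p>If you stay with the traditional session <span class="caps">API</span>, one way to confine all <span class="caps">SNMP</span> operations to a single thread is to funnel them through a queue served by a dedicated thread. The following Python sketch illustrates the idea. <code>SnmpThread</code> is a hypothetical class and the submitted callables stand in for real <em>NetSNMP</em> calls:</p>

```python
import queue
import threading


class SnmpThread:
    """Run every submitted callable on one dedicated thread."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel posted by close()
                return
            func, args, done, box = job
            try:
                box["result"] = func(*args)
            except Exception as exc:  # hand the error back to the caller
                box["error"] = exc
            done.set()

    def call(self, func, *args):
        """Execute func(*args) on the SNMP thread and wait for its result."""
        done, box = threading.Event(), {}
        self._jobs.put((func, args, done, box))
        done.wait()
        if "error" in box:
            raise box["error"]
        return box["result"]

    def close(self):
        """Stop the thread once the pending jobs are processed."""
        self._jobs.put(None)
        self._thread.join()


if __name__ == "__main__":
    snmp = SnmpThread()
    first = snmp.call(threading.get_ident)
    second = snmp.call(threading.get_ident)
    print(first == second, first != threading.get_ident())
    snmp.close()
```

<p>Waiting for the result keeps the sketch short. In an event-driven program, you would rather hand the result back to the event loop through a callback. Also note that invoking <code>call()</code> from the <span class="caps">SNMP</span> thread itself would deadlock.</p>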
<div class="footnote">
<hr>
<ol>
<li id="fn:base">
<p>The event list is global but some code could be added to bind
     a different list for each base in case you use different
     bases. This is not needed in the case of <em>lldpd</em> which uses
     only one event base.&#160;<a href="#fnref:base" rev="footnote" title="Jump back to footnote 1 in the text">&#8617;</a></p>
</li>
<li id="fn:twistedsnmp">
<p><a href="http://twistedsnmp.sourceforge.net/" title="Set of SNMP protocol implementations for Python's Twisted Matrix networking framework">TwistedSNMP</a> is an <span class="caps">SNMP</span> protocol implementation
            for <em>Twisted</em> based on <em>PySNMP</em> but is not maintained
            anymore. <a href="http://pysnmp.sourceforge.net/" title="Cross-platform, pure-Python SNMP engine implementation">PySNMP</a> comes with examples on how to
            integrate with <em>Twisted</em>.&#160;<a href="#fnref:twistedsnmp" rev="footnote" title="Jump back to footnote 2 in the text">&#8617;</a></p>
</li>
</ol>
</div>
]]>
            </content>
        </entry>
    </feed>