<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://praeclarum.org/rss" rel="self" type="application/atom+xml" /><link href="https://praeclarum.org/" rel="alternate" type="text/html" /><updated>2026-05-16T13:08:58+00:00</updated><id>https://praeclarum.org/rss</id><title type="html">praeclarum</title><subtitle>Frank A. Krueger&apos;s Blog</subtitle><entry><title type="html">Coding with Local Agents on an RTX 3090</title><link href="https://praeclarum.org/2026/05/05/coding-with-local-agents.html" rel="alternate" type="text/html" title="Coding with Local Agents on an RTX 3090" /><published>2026-05-05T00:00:00+00:00</published><updated>2026-05-05T00:00:00+00:00</updated><id>https://praeclarum.org/2026/05/05/coding-with-local-agents</id><content type="html" xml:base="https://praeclarum.org/2026/05/05/coding-with-local-agents.html"><![CDATA[<p><strong>TL;DR</strong> Running coding agents on local machines has never been easier. This article gives easy setup instructions for running Qwen 3.6 27B on an RTX 3090 in Linux. I then show how to use the model in VS Code using the LLM Gateway extension. By the end of this guide, you’ll be free of service providers and able to run a variety of OSS models.</p>

<p><strong>Update (May 16, 2026)</strong>: Over a week has gone by since posting this, so of course everything has changed. Well not <em>everything</em> but MTP is the new performance hotness and I’ve added a section about how I use it at the end.</p>

<h2 id="overview">Overview</h2>

<p>There are roughly two steps to running a local coding agent:</p>

<ol>
  <li>Get the model up and running serving the standard chat API.</li>
  <li>Connect the model to your coding environment (e.g., VS Code).</li>
</ol>

<p>There are hundreds of different OSS models, and hundreds of different model servers to choose from. You have, frankly, an overwhelming number of options to fulfill step #1. That said, if you’re looking to run these models on consumer-grade hardware, you will be looking at models in the 7B-31B parameter range. Here is one site, of many, that tries to rank these beasts: <a href="https://artificialanalysis.ai/models/open-source/small#intelligence">Artificial Analysis</a></p>

<p>For this guide, I will focus on <strong>Qwen 3.6 27B</strong> from Alibaba since it works well-enough. But <strong>Gemma 4 31B</strong> from Google is a champ and is worth also looking at.</p>

<p>There is a wonderful arms race happening with model servers right now too. A model server is a giant math library, optimized into oblivion, that deigns to run an HTTP server so it can service requests. But it also has one more crucial component: a caching layer that keeps as much of chat conversations in GPU memory as possible in order to minimize latency and compute time - the KV cache.</p>

<p>For this guide, I will focus on <a href="https://llama-cpp.com"><strong>llama.cpp</strong></a> since it is pretty popular, easy to use, and has good GPU support. But there are a number of other servers that are worth looking at, including <a href="https://vllm.ai"><strong>vLLM</strong></a>, <a href="https://ollama.com"><strong>Ollama</strong></a>, <a href="https://github.com/ml-explore/mlx-lm"><strong>MLX-LM</strong></a>, <a href="https://mtplx.com"><strong>MTPLX</strong></a>, and on and on.</p>

<h2 id="download-the-model">Download the Model</h2>

<p>This is both the easy part, and the hard part. Easy, because all you have to do is go to Hugging Face and download any of the thousands of models available. It’s hard because there are <em>so many models</em>! There’s model families, model sizes, model fine tunes, model quantizations, model formats. Oh my!</p>

<p>Most inference engines (like llama.cpp) support a specific set of model formats, so that will narrow down your options. For llama.cpp, the supported format is GGUF, so you’ll want to look for <a href="https://huggingface.co/models?search=gguf">models in that format</a>. For MLX models (to run on Apple Silicon), you’ll look in the <a href="https://huggingface.co/mlx-community">mlx-community</a>.</p>

<p>You’ll now need to pick a quantization size. Quantization is a compression method for model weights. If we took a 27 billion parameter model with 32-bit floating point weights, it would be 27B * 32 bits = 108 GB in size. Unless you have a datacenter handy, you won’t be running that. Instead, you’ll choose, say a 4-bit quantized model. This will compress the weights down to 27B * 4 bits = 13.5 GB, which is much more manageable for consumer hardware. The tradeoff is that quantization can reduce the model’s performance and accuracy, but it’s often a necessary compromise.</p>

<p>Now the RTX 3090 has 24 GB of VRAM so you might be tempted to pick a higher-bit quantization, but you have to keep in mind that the <em>context</em> and the <em>output</em> also have to fit in GPU memory. If you want long contexts and long outputs, you might have to go with a lower-bit quantization to ensure everything fits.</p>

<p>The <code class="language-plaintext highlighter-rouge">Q4_K_M</code> quantization format is a good compromise for a 27B model and a 24 GB GPU. So I’m going to download the <code class="language-plaintext highlighter-rouge">Qwen 3.6 27B Q4_K_M</code> model from Hugging Face:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>wget <span class="s2">"https://huggingface.co/unsloth/Qwen3.6-27B-GGUF/resolve/main/Qwen3.6-27B-Q4_K_M.gguf?download=true"</span>
</code></pre></div></div>

<p>(<code class="language-plaintext highlighter-rouge">wget</code> is a little dumb, so you’ll need to rename the file after downloading it since it doesn’t handle the <code class="language-plaintext highlighter-rouge">?download=true</code> part of the URL very well.)</p>

<h2 id="build-llamacpp">Build llama.cpp</h2>

<p>You can download prebuilt libraries of llama.cpp but if you want to ensure its optimized for your machine and hardware, you’ll want to build it yourself. Thankfully, it’s pretty easy to do:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>git clone https://github.com/ggerganov/llama.cpp
<span class="nb">cd </span>llama.cpp
cmake <span class="nt">-B</span> build <span class="nt">-DGGML_CUDA</span><span class="o">=</span>ON
cmake <span class="nt">--build</span> build <span class="nt">--config</span> Release <span class="nt">-j</span> 8
</code></pre></div></div>

<p>Aside from the nastiness of having to use CMake, building software doesn’t get much easier than this.</p>

<p>I passed the <code class="language-plaintext highlighter-rouge">-DGGML_CUDA=ON</code> flag to ensure that I get NVIDIA CUDA support, which is crucial for running these large models on consumer-grade hardware. If you’re on an M-series Mac, you would want to pass <code class="language-plaintext highlighter-rouge">-DGGML_METAL=ON</code> instead to get support for Apple’s Metal API.</p>

<p>If all goes well, you will have a nice, shiny <code class="language-plaintext highlighter-rouge">build/bin/llama-server</code> executable that you can use to serve your model.</p>

<h2 id="run-the-server">Run the Server</h2>

<p>You will want to run the server with a delicious soup of command line arguments. Something like this:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./build/bin/llama-server <span class="nt">-m</span> ~/Downloads/Qwen3.6-27B-Q4_K_M.gguf <span class="nt">--host</span> 0.0.0.0 <span class="nt">-ngl</span> 99 <span class="nt">-c</span> 262144 <span class="nt">-fa</span> on <span class="nt">--cache-type-k</span> q4_0 <span class="nt">--cache-type-v</span> q4_0
</code></pre></div></div>

<p>Let’s deconstruct that soup:</p>

<table>
  <thead>
    <tr>
      <th>Argument</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">-m</code></td>
      <td>The path to the model file you downloaded.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--host 0.0.0.0</code></td>
      <td>This tells the server to listen on all network interfaces, which is necessary if you want to connect to it from another machine (e.g., your dev machine).</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">-ngl 99</code></td>
      <td>This sets the number of GPU layers to use. Setting this to 99 tells the server to use as many GPU layers as possible, which will maximize performance.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">-c 262144</code></td>
      <td>This sets the context size to 262,144 tokens, which is the maximum context size for this model. You can adjust this based on your needs and GPU memory constraints.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">-fa on</code></td>
      <td>This enables the “faster auto-regressive decoding” feature, which can improve performance.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--cache-type-k q4_0 --cache-type-v q4_0</code></td>
      <td>This sets the quantization type for the KV cache to <code class="language-plaintext highlighter-rouge">q4_0</code>, which is a good choice for performance and memory efficiency.</td>
    </tr>
  </tbody>
</table>

<p>Notice how we are quantizing the KV cache (context and outputs) as well. This is a crucial step for ensuring that the model runs efficiently on consumer-grade hardware, as the KV cache can consume a significant amount of GPU memory.</p>

<p>You’ll be greeted with typical programmer excretions:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ggml_cuda_init: found 1 CUDA devices (Total VRAM: 24159 MiB):
  Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes, VRAM: 24159 MiB
main: n_parallel is set to auto, using n_parallel = 4 and kv_unified = true
build_info: b9026-a817a22bc
system_info: n_threads = 6 (n_threads_batch = 6) / 12 | CUDA : ARCHS = 860 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 | 
Running without SSL
init: using 11 threads for HTTP server
start: binding port with default address family
main: loading model
srv    load_model: loading model '/home/fak/Downloads/Qwen3.6-27B-Q4_K_M.gguf'
common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on
common_params_fit_impl: getting device memory data for initial parameters:
common_memory_breakdown_print: | memory breakdown [MiB] | total    free     self   model   context   compute    unaccounted |
common_memory_breakdown_print: |   - CUDA0 (RTX 3090)   | 24159 = 23257 + (21388 = 15345 +    5206 +     836) +      -20486 |
common_memory_breakdown_print: |   - Host               |                   1214 =   682 +       0 +     532                |
common_params_fit_impl: projected to use 21388 MiB of device memory vs. 23257 MiB of free device memory
common_params_fit_impl: will leave 1868 &gt;= 1024 MiB of free device memory, no changes needed
common_fit_params: successfully fit params to free device memory
common_fit_params: fitting params to free memory took 0.66 seconds
llama_model_loader: loaded meta data with 51 key-value pairs and 851 tensors from /home/fak/Downloads/Qwen3.6-27B-Q4_K_M.gguf (version GGUF V3 (latest))
</code></pre></div></div>

<p>Congratulations. You’re now an AI service provider. I recommend getting some seed capital and start selling access to your model to the highest bidder.</p>

<p>But before you do that…</p>

<h2 id="install-llm-gateway-in-vs-code">Install LLM Gateway in VS Code</h2>

<p>I rock VS Code for all my coding needs, and I want to be able to use my local model in its AI agent chat window thingy. To do that, I need to install an extension that connects VS Code to the standard chat API. (Why VS Code doesn’t support the API standard that literally <em>every</em> LLM server provides is beyond me.)</p>

<p>ANYWAY, I like the <a href="https://marketplace.visualstudio.com/items?itemName=AndrewButson.github-copilot-llm-gateway">LLM Gateway extension</a> by Andrew Butson.</p>

<ol>
  <li>Install that extension.</li>
  <li>Open the “GitHub Copilot LLM Gateway: Configure Server” UI from the command palette and enter the URL for your server (e.g., <code class="language-plaintext highlighter-rouge">http://my-awesome-server.local:8080</code>).</li>
  <li>Test the connection with the “GitHub Copilot LLM Gateway: Test Server Connection” command. It should say “Found 1 model(s)” if everything is working. (If it’s not working, email James Montemagno and ask him for help.)</li>
  <li>Open the “Chat: Manage Language Models” UI from the command palette. You should see your model listed but it will probably be grayed out for some reason. Click it, click the eye ball (gross!), and it should now be active and ready to use in the chat window.</li>
  <li>Open the chat window, and click the model selector. Choose “Other Models”, scroll, and scroll, looking for your model. It’s there somewhere. I promise. You might doubt it, but have faith. When in doubt, keep scrolling. You can do it. You found it! Click it, and now you can use your local model in the chat window!</li>
</ol>

<h2 id="mtp-for-speed">MTP for Speed</h2>

<p>(Added May 16, 2026)</p>

<p>Since posting this article, Multi-Token Prediction (MTP) has been released, and it is a game changer for performance. MTP is a new parallelism method that <em>somehow</em> makes things faster by doing more work. Weird, I know. Instead of the purely serial operation of (1) generate a token, (2) add it to the context, (3) GOTO 1, MTP uses a much smaller model to <em>quickly</em> do the 1-2-3 dance for a few tokens and then uses the real model <strong>to verify the results</strong>. Given the nature of these beasts, the smaller model takes up very little wall time but still has OK-ish accuracy. The big model, instead of being just a generator, is used to verify the probabilities of these new tokens. It can do that quickly because it doesn’t need to test them one at a time, but can test them all in parallel (vs serial). Since modern LLMs are memory bandwidth bound, not compute bound, this parallel execution is “free”. The result is a huge speed boost, about 1.4x-1.8x. It’s a crazy hack, and I’m here for it.</p>

<p>As of this writing, MTP is still a work-in-progress for llama.cpp, but it is available in PRs and forks. I’m compiling using <a href="https://github.com/am17an/llama.cpp/tree/mtp-clean">am17an’s fork</a>.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./build/bin/llama-server <span class="nt">-hf</span> unsloth/Qwen3.6-27B-MTP-GGUF:UD-Q4_K_XL <span class="nt">--host</span> 0.0.0.0 <span class="nt">-c</span> 150000 <span class="nt">-ngl</span> 99 <span class="nt">-fa</span> on <span class="nt">--cache-type-k</span> q4_0 <span class="nt">--cache-type-v</span> q4_0 <span class="nt">--temp</span> 0.6 <span class="nt">--top-p</span> 0.95 <span class="nt">--top-k</span> 20 <span class="nt">--presence-penalty</span> 0.0 <span class="nt">--min-p</span> 0.00 <span class="nt">--spec-type</span> draft-mtp <span class="nt">--spec-draft-n-max</span> 2
</code></pre></div></div>

<p>The important new args are:</p>

<table>
  <thead>
    <tr>
      <th>Argument</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--spec-type draft-mtp</code></td>
      <td>This tells the server to use the MTP parallelism method.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--spec-draft-n-max 2</code></td>
      <td>This sets the maximum number of tokens to predict in parallel. 2 is a very conservative choice.</td>
    </tr>
  </tbody>
</table>

<p>I’ve also modified a few other sampling parameters based on recommendations for coding environments:</p>

<table>
  <thead>
    <tr>
      <th>Argument</th>
      <th>Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--temp 0.6</code></td>
      <td>This sets the temperature to 0.6, which is a good choice for coding tasks as it encourages more deterministic outputs while still allowing for some creativity.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--top-p 0.95</code></td>
      <td>This sets the nucleus sampling parameter to 0.95, which helps to ensure that the model generates more relevant and coherent code by focusing on the most probable tokens.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--top-k 20</code></td>
      <td>This sets the top-k sampling parameter to 20, which limits the number of tokens considered at each step to the 20 most likely, further improving the relevance of the generated code.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--presence-penalty 0.0</code></td>
      <td>This sets the presence penalty to 0.0, which means that the model will not be penalized for generating tokens that have already appeared in the context, which can be beneficial for coding tasks where repetition of certain tokens (e.g., variable names, function names) is common.</td>
    </tr>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">--min-p 0.00</code></td>
      <td>This sets the minimum probability threshold to 0.00, which means that the model will consider all tokens regardless of their probability, allowing for a wider range of potential outputs.</td>
    </tr>
  </tbody>
</table>

<p>I also switched to the <code class="language-plaintext highlighter-rouge">Qwen3.6-27B-MTP-GGUF</code> model, which includes the smaller MTP model needed for the parallel token prediction. And I switched to the <code class="language-plaintext highlighter-rouge">UD-Q4_K_XL</code> quantization format, because guessing which quantization format to use is half the fun of self-hosting models.</p>

<h2 id="is-it-worth-it">Is it Worth It?</h2>

<p>What does an RTX machine cost these days?</p>

<table>
  <thead>
    <tr>
      <th>Component</th>
      <th>Price (USD)</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>RTX 3090</td>
      <td>$1500</td>
    </tr>
    <tr>
      <td>CPU</td>
      <td>$300</td>
    </tr>
    <tr>
      <td>64 GB RAM</td>
      <td>$700 (what has the world come to?)</td>
    </tr>
    <tr>
      <td>HDD</td>
      <td>$200</td>
    </tr>
    <tr>
      <td>PSU</td>
      <td>$150</td>
    </tr>
    <tr>
      <td>Case</td>
      <td>$100</td>
    </tr>
    <tr>
      <td><strong>Total</strong></td>
      <td><strong>$2,950</strong></td>
    </tr>
  </tbody>
</table>

<p>So for about $3,000 you can have your very own local coding agent. That’s a pretty hefty price tag, but it’s also a one-time cost.</p>

<p>In a typical day, I burn through about 50,000,000 tokens. 500,000 output tokens, 1,750,000 input tokens, and the rest are cache hits. At 40 tok/second (typical for my RTX), my compute day is about <code class="language-plaintext highlighter-rouge">(500,000 + 1,750,000) / 40 = 56,250 seconds</code>, which is about <strong>15.6 hours of compute time per day</strong>. Ugh.</p>

<p>Right now, you can use DeepSeek for $3.48 per 1,000,000 output tokens, $1.74 for inputs, and $0.0145 for cache hits. So my daily cost would be <code class="language-plaintext highlighter-rouge">(500,000 / 1,000,000) * 3.48 + (1,750,000 / 1,000,000) * 1.74 + (47,750,000 / 1,000,000) * 0.0145 = $5.48</code> per day. That’s about $1,400 per year (five day work weeks). So in about 2 years, I would recoup the cost of running my own local agent. Hmmm…</p>

<p>So you might not want to run out and buy your own server. But, if you do have an over-provisioned gaming rig, well you might as well put it to use doing something useful. ;-)
X</p>

<h2 id="conclusion">Conclusion</h2>

<p>Since 2017 I have been advocating running local models. I’m amazed that it’s now possible to run 27B parameter variants on consumer hardware. (In my mind, 7B is still tremendous.) These are real models, able to write good code, in a fully agentic harness. Amazing.</p>

<p>While the up front hardware cost, the noise of fans, and the slower response rates are not ideal and don’t make this an easy win, I have a different perspective. AI coding has changed how I work. Permanently. I do not want to go back to writing every line of code by hand, it seems absurd now. But I also don’t like being at the mercy of large cloud providers. Having the ability to run my own local agent, even with its limitations, is a huge win for me. I know, even with no internet connection, I can still do what I love: code.</p>

<p><strong>Colophon:</strong> Written by hand. Proofread and edited by Qwen 3.6 27B running on an RTX 3090.</p>]]></content><author><name></name></author><category term="article" /><summary type="html"><![CDATA[TL;DR Running coding agents on local machines has never been easier. This article gives easy setup instructions for running Qwen 3.6 27B on an RTX 3090 in Linux. I then show how to use the model in VS Code using the LLM Gateway extension. By the end of this guide, you’ll be free of service providers and able to run a variety of OSS models.]]></summary></entry><entry><title type="html">AppReviewFetch - Because App Store Connect Shouldn’t Require a Web Browser</title><link href="https://praeclarum.org/2026/01/21/arfetch.html" rel="alternate" type="text/html" title="AppReviewFetch - Because App Store Connect Shouldn’t Require a Web Browser" /><published>2026-01-21T00:00:00+00:00</published><updated>2026-01-21T00:00:00+00:00</updated><id>https://praeclarum.org/2026/01/21/arfetch</id><content type="html" xml:base="https://praeclarum.org/2026/01/21/arfetch.html"><![CDATA[<p><strong>TL;DR</strong> I wrote a .NET library, CLI tool, and MCP server for fetching app reviews from App Store Connect (with Google Play support in the works). You can use them yourself with the NuGet packages <a href="https://www.nuget.org/packages/AppReviewFetch">AppReviewFetch</a>, <a href="https://www.nuget.org/packages/AppReviewFetch.Cli">AppReviewFetch.Cli</a>, and <a href="https://www.nuget.org/packages/AppReviewFetch.Mcp">AppReviewFetch.Mcp</a>. The MCP server is especially exciting because it lets AI assistants analyze your reviews for you.</p>

<p>If you’re an Apple app developer, you’ve done The Dance. You know the one. Open Safari. Navigate to App Store Connect. Click through seventeen dropdown menus. Wait for the world’s slowest web app to load. Click “My Apps.” Wait again. Click your app. Wait. Click “Ratings and Reviews.” Wait. Squint at a paginated table that shows you 10 reviews at a time. Wonder what your users are actually thinking. Give up and go make coffee.</p>

<p>This is 2026. I have apps that have been on the App Store for well over a decade. I have accumulated a lot of reviews. I want to <em>read</em> those reviews. I want to <em>search</em> those reviews. I want to understand what my users love and what makes them cry. And I don’t want to do any of that through a web browser that feels like it’s running on a Pentium II.</p>

<p>So I built AppReviewFetch.</p>

<h2 id="the-boring-part-the-library">The Boring Part (The Library)</h2>

<p>At its core, AppReviewFetch is a .NET library for talking to the App Store Connect API. It handles the JWT authentication dance (which, let me tell you, is a <em>dance</em>), fetches your apps, fetches your reviews, handles pagination, and gives you nice strongly-typed objects to work with.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dotnet add package AppReviewFetch
</code></pre></div></div>

<div class="language-csharp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">using</span> <span class="nn">AppReviewFetch</span><span class="p">;</span>

<span class="kt">var</span> <span class="n">service</span> <span class="p">=</span> <span class="k">new</span> <span class="nf">AppStoreConnectService</span><span class="p">();</span>
<span class="kt">var</span> <span class="n">reviews</span> <span class="p">=</span> <span class="k">await</span> <span class="n">service</span><span class="p">.</span><span class="nf">GetReviewsAsync</span><span class="p">(</span><span class="s">"YOUR_APP_ID"</span><span class="p">,</span> <span class="k">new</span> <span class="n">ReviewRequest</span>
<span class="p">{</span>
    <span class="n">SortOrder</span> <span class="p">=</span> <span class="n">ReviewSortOrder</span><span class="p">.</span><span class="n">NewestFirst</span><span class="p">,</span>
    <span class="n">Limit</span> <span class="p">=</span> <span class="m">100</span><span class="p">,</span>
    <span class="n">Country</span> <span class="p">=</span> <span class="s">"US"</span>
<span class="p">});</span>

<span class="k">foreach</span> <span class="p">(</span><span class="kt">var</span> <span class="n">review</span> <span class="k">in</span> <span class="n">reviews</span><span class="p">.</span><span class="n">Reviews</span><span class="p">)</span>
<span class="p">{</span>
    <span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="s">$"</span><span class="p">{</span><span class="n">review</span><span class="p">.</span><span class="n">Rating</span><span class="p">}</span><span class="s">/5 - </span><span class="p">{</span><span class="n">review</span><span class="p">.</span><span class="n">Title</span><span class="p">}</span><span class="s">"</span><span class="p">);</span>
    <span class="n">Console</span><span class="p">.</span><span class="nf">WriteLine</span><span class="p">(</span><span class="n">review</span><span class="p">.</span><span class="n">Body</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Is this exciting? Not particularly. It’s plumbing, good plumbing, I like to think. Clean interfaces, proper exception handling, support for dependency injection are all good plumbing. It’s the kind of code that has to exist so that more interesting things can be built on top of it.</p>

<p>If you want an easy API to build your own tools, this is it.</p>

<p>But I know you didn’t come here for plumbing.</p>

<h2 id="the-actually-useful-part-the-cli">The Actually Useful Part (The CLI)</h2>

<p>Here’s my pitch: what if you never had to open App Store Connect in a web browser again to check your reviews?</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dotnet tool <span class="nb">install</span> <span class="nt">-g</span> AppReviewFetch.Cli
arfetch
</code></pre></div></div>

<p>That’s it. Now you have a beautiful, interactive REPL that does everything the web interface does, except faster and without making you want to throw your computer out the window.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>arfetch&gt; list
┌────────────────────────────────────────────────────────────────────────────┐
│                              Available Apps                                 │
├──────────────┬─────────────────────────┬─────────────────────────┬─────────┤
│     App ID   │          Name           │        Bundle ID        │  Store  │
├──────────────┼─────────────────────────┼─────────────────────────┼─────────┤
│  123456789   │        iCircuit         │ com.krueger.icircuit    │  Apple  │
│  987654321   │          Calca          │ com.krueger.calca       │  Apple  │
└──────────────┴─────────────────────────┴─────────────────────────┴─────────┘

arfetch&gt; fetch iCircuit
⭐⭐⭐⭐⭐ "Best circuit simulator ever!"
"I've been using this app for years and it just keeps getting better..."

⭐⭐⭐⭐☆ "Great app, one suggestion"
"Love the Arduino support! Would be nice if it could also simulate..."

arfetch&gt; fetch iCircuit US
[Shows only reviews from the United States]

arfetch&gt; export reviews.csv
✓ Exported 342 reviews to reviews.csv
</code></pre></div></div>

<p>You can search by app name, bundle ID, or app ID. You can filter by country. You can export to CSV for further analysis. The reviews are beautifully formatted with colors and star ratings. Developer responses are included. Pagination is handled with a simple “more” prompt.</p>

<p>It’s the kind of tool I always wished existed. So I built it.</p>

<p>Setup takes about two minutes: you need an App Store Connect API key (which Apple now makes reasonably easy to generate), and you run <code class="language-plaintext highlighter-rouge">arfetch setup</code> which walks you through entering your credentials. Done.</p>

<h2 id="the-wild-part-the-mcp-server">The Wild Part (The MCP Server)</h2>

<p>Okay. Here’s where things get fun.</p>

<p><a href="https://modelcontextprotocol.io/">Model Context Protocol</a> is a standardized way for AI assistants to access external data sources. GitHub Copilot supports it. Claude supports it. And now AppReviewFetch supports it.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dotnet tool <span class="nb">install</span> <span class="nt">-g</span> AppReviewFetch.Mcp
</code></pre></div></div>

<p>Add a few lines to your VS Code settings or Claude Desktop config, and suddenly your AI assistant can read your app reviews.</p>

<p>This is where things get really interesting.</p>

<p>You can ask GitHub Copilot, “What are users complaining about in my app?” and it will actually go fetch your reviews, analyze them, and tell you. Not hallucinated reviews from its training data. <em>Your actual reviews</em>. Right now. In real time.</p>

<h3 id="things-you-can-do-that-sound-made-up-but-arent">Things You Can Do That Sound Made Up But Aren’t</h3>

<p><strong>“Analyze the sentiment of my reviews over the past month and tell me if things are getting better or worse.”</strong></p>

<p>The MCP server will fetch hundreds of reviews, compute rating distributions, analyze trends, and give you a comprehensive report. No clicking through web pages. No exporting CSVs and opening them in Excel. Just ask.</p>

<p><strong>“Show me all the 1-star reviews and categorize them by complaint type.”</strong></p>

<p>Your AI assistant will identify patterns you might have missed. “42% of negative reviews mention performance issues on older devices. 28% mention difficulty with the tutorial. 17% are people who clearly don’t understand what the app does.”</p>

<p><strong>“Draft responses to this week’s negative reviews.”</strong></p>

<p>AI assistants are pretty good at writing polite, helpful customer service responses. Now they can do it with full context of what the user actually said.</p>

<p><strong>“Compare my review sentiment to what users are saying on Reddit about my app.”</strong></p>

<p>Combine the MCP server with web search and you’ve got a comprehensive view of user sentiment across platforms.</p>

<p><strong>“Every morning, give me a summary of new reviews across all my apps.”</strong></p>

<p>Set up a simple automation and never be surprised by a sudden influx of negative reviews again.</p>

<p><strong>“Find reviews that mention crashes and correlate them with my crash reports.”</strong></p>

<p>Cross-reference user-reported issues with your crash analytics to prioritize bug fixes.</p>

<h3 id="the-really-wild-ideas">The Really Wild Ideas</h3>

<p>Here’s where my imagination starts running ahead of my implementation:</p>

<ul>
  <li><strong>Automated Triage</strong>: Have an AI assistant monitor reviews and file GitHub issues for reported bugs, tagged by severity based on the language used.</li>
  <li><strong>A/B Testing Insights</strong>: After a release, ask “Are users responding better to the new design?” and get an answer based on actual review text analysis.</li>
  <li><strong>Competitive Analysis</strong>: If you’re analyzing reviews for your own apps, what about… other apps? (The API only gives you access to your own apps, but the architecture is there for when that changes.)</li>
  <li><strong>Multi-Language Sentiment</strong>: Reviews come in many languages. AI is pretty good at understanding all of them. “What are my Japanese users saying that’s different from my American users?”</li>
  <li><strong>Proactive Outreach</strong>: Identify users who left thoughtful negative reviews and might be good beta testers for new features.</li>
</ul>

<p>I’ve built the foundation. The MCP server exposes three tools—ListApps, FetchReviews, and AnalyzeReviews—and the AI takes it from there. The creativity is limited only by what you ask it to do.</p>

<h2 id="the-setup">The Setup</h2>

<p>Getting started is straightforward:</p>

<ol>
  <li><strong>Get an App Store Connect API Key</strong>
    <ul>
      <li>Go to App Store Connect → Users and Access → Keys</li>
      <li>Generate a key with App Manager, Customer Support, or Admin access</li>
      <li>Download the .p8 file and note your Key ID and Issuer ID</li>
    </ul>
  </li>
  <li><strong>Install and Configure</strong>
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dotnet tool <span class="nb">install</span> <span class="nt">-g</span> AppReviewFetch.Cli
arfetch setup
</code></pre></div>    </div>
  </li>
  <li><strong>For MCP</strong>: Add the server to your AI assistant’s config:
    <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"servers"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"appreviewfetch"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"command"</span><span class="p">:</span><span class="w"> </span><span class="s2">"arfetch-mcp"</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div>    </div>
  </li>
</ol>

<p>That’s it. The CLI and MCP server share the same credentials file, so configure once, use everywhere.</p>

<h2 id="the-three-packages">The Three Packages</h2>

<ul>
  <li><strong><a href="https://www.nuget.org/packages/AppReviewFetch">AppReviewFetch</a></strong> - The core library. Use this if you’re building your own tooling.</li>
  <li><strong><a href="https://www.nuget.org/packages/AppReviewFetch.Cli">AppReviewFetch.Cli</a></strong> - The interactive CLI. Use this if you want a better alternative to App Store Connect’s web interface.</li>
  <li><strong><a href="https://www.nuget.org/packages/AppReviewFetch.Mcp">AppReviewFetch.Mcp</a></strong> - The MCP server. Use this if you want AI to do the analysis for you.</li>
</ul>

<p>All three are MIT licensed and <a href="https://github.com/praeclarum/AppReviewFetch">available on GitHub</a>.</p>

<h2 id="whats-next">What’s Next</h2>

<p>Google Play support is in early beta. The API is different (shocker), and Google’s authentication is its own special adventure, but the architecture is already there—the library is built around an <code class="language-plaintext highlighter-rouge">IAppReviewService</code> interface that can support multiple stores.</p>

<p>I also want to add Windows Store support, but I haven’t looked into that API yet.</p>

<p>There are of course other sources of feedback for apps, email and social media being the big ones. If I can find reliable APIs for those, I may add them as well.</p>

<h2 id="the-real-point">The Real Point</h2>

<p>I’ve been an iOS developer since 2008. Eighteen years. That’s a lot of reviews. That’s a lot of time spent in App Store Connect’s web interface, clicking and waiting and clicking and waiting.</p>

<p>Developer tools should respect developers’ time. They should be fast. They should be automatable. They should integrate with the workflows we’re already using.</p>

<p>App Store Connect is none of those things. So I built something that is.</p>

<p>And now that AI assistants can actually understand and act on data from external sources, the potential for what we can do with our apps’ feedback has exploded. The reviews are just sitting there. They’re full of insights. We just need better ways to access them.</p>

<table>
  <tbody>
    <tr>
      <td><a href="https://github.com/praeclarum/AppReviewFetch">AppReviewFetch on GitHub</a></td>
      <td><a href="https://www.nuget.org/packages?q=AppReviewFetch">NuGet Packages</a></td>
    </tr>
  </tbody>
</table>]]></content><author><name></name></author><category term="announcement" /><summary type="html"><![CDATA[TL;DR I wrote a .NET library, CLI tool, and MCP server for fetching app reviews from App Store Connect (with Google Play support in the works). You can use them yourself with the NuGet packages AppReviewFetch, AppReviewFetch.Cli, and AppReviewFetch.Mcp. The MCP server is especially exciting because it lets AI assistants analyze your reviews for you.]]></summary></entry><entry><title type="html">The Many Ways to Deploy iOS Apps in 2025</title><link href="https://praeclarum.org/2025/10/21/many-ways-to-deploy-ios.html" rel="alternate" type="text/html" title="The Many Ways to Deploy iOS Apps in 2025" /><published>2025-10-21T00:00:00+00:00</published><updated>2025-10-21T00:00:00+00:00</updated><id>https://praeclarum.org/2025/10/21/many-ways-to-deploy-ios</id><content type="html" xml:base="https://praeclarum.org/2025/10/21/many-ways-to-deploy-ios.html"><![CDATA[<p><strong>TL;DR</strong> There are no less than 4 different ways you can deploy your iOS app to a physical device for testing. I enumerate all of them below, along with their pros and cons. In the end, I recommend using <code class="language-plaintext highlighter-rouge">dotnet build -t:Run</code> if you can, as it is the simplest and most reliable method. But knowing the alternatives can be useful in certain situations.</p>

<h2 id="from-none-to-many">From None to Many</h2>

<p>Just a couple years ago, I didn’t know <em>any</em> of the ways to deploy my .NET/MAUI iOS apps to a physical device. I only knew how to use Visual Studio for Mac’s built-in run button, which worked fine for me. Until they killed it.</p>

<p>At present, there are only two IDEs for macOS that support .NET MAUI iOS development: JetBrains Rider and Visual Studio Code. Rider <em>should</em> be nice, but has terrible bugs where it will sometimes try to deploy <code class="language-plaintext highlighter-rouge">iossimulator</code> builds to the device, or that persistence bug where it fails to rebuild apps when dependencies change, or it just doesn’t detect devices at all. It’s not at all reliable. Visual Studio Code <em>should</em> be nice too, but it basically requires that you write MAUI apps for anything to work. If you’re like me and prefer to build native UIs, then you’re out of luck.</p>

<p>But there is good news, and I am here to deliver it. There’s not one but <em>four</em> different ways you can deploy your iOS app to a physical device for testing, none of which require an IDE. Here they are, in order of my preference with my own heart-felt pros and cons.</p>

<h2 id="1-dotnet-build--trun">1. <code class="language-plaintext highlighter-rouge">dotnet build -t:Run</code></h2>

<p>It turns out, good old MSBuild has a built-in task to do exactly what we want. You can use the <code class="language-plaintext highlighter-rouge">dotnet build</code> command with the <code class="language-plaintext highlighter-rouge">-t:Run</code> target to build and deploy your app to a connected device in one step. How is this different from <code class="language-plaintext highlighter-rouge">dotnet run</code>? I don’t know. Don’t ask me, ask Microsoft.</p>

<p>You can read all about <code class="language-plaintext highlighter-rouge">dotnet build -t:Run</code> in the official <a href="https://learn.microsoft.com/en-us/dotnet/maui/ios/cli?view=net-maui-9.0#launch-the-app-on-a-device">“Launch the app on a device” documentation</a>.</p>

<p>Just run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dotnet build <span class="nt">-t</span>:Run <span class="nt">-f</span> net9.0-ios <span class="nt">-p</span>:RuntimeIdentifier<span class="o">=</span>ios-arm64 <span class="nt">-p</span>:_DeviceName<span class="o">=</span>MY_SPECIFIC_UDID MyApp.csproj
</code></pre></div></div>

<p>and replace <code class="language-plaintext highlighter-rouge">MY_SPECIFIC_UDID</code> with your device’s UDID and <code class="language-plaintext highlighter-rouge">MyApp.csproj</code> with the name of your project file (or elide it if you’re in the project’s directory). Adjust the <code class="language-plaintext highlighter-rouge">-f</code> target framework to match your project (for example, <code class="language-plaintext highlighter-rouge">net8.0-ios</code>).</p>

<p>A UDID is a Unique Device Identifier, a unique string that identifies your iOS device. It’s like a UUID, but there’s a D instead of a U.</p>

<h3 id="how-to-find-your-devices-udid">How to Find Your Device’s UDID</h3>

<p>Don’t you worry, here are 4 easy ways to find your device’s UDID:</p>

<ol>
  <li><strong>Xcode</strong>: Open Xcode, go to “Window” &gt; “Devices and Simulators”, and select your device. The UDID will be listed there.</li>
  <li><strong>xcrun</strong>: Buried in every Xcode installation is the <code class="language-plaintext highlighter-rouge">xcrun</code> command line tool. You can use it to list connected devices and their UDIDs with this command:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> xcrun devicectl list devices <span class="nt">--hide-default-columns</span> <span class="nt">--columns</span> Name <span class="nt">--columns</span> UDID
</code></pre></div>    </div>
    <p>Output will look something like this:</p>
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Name                    UDID                     
 ---------------------   -------------------------
 Precious XXIII          00008030-000E409C0E10802E
 Precious XXIV           00008103-001829AE14BB001E
 Precious XXVII          00008140-000A0DC83013C01C
 Precious XXVIII         00008150-001C2C3E0EFB801C
</code></pre></div>    </div>
    <p>The only problem is that it will only report modern devices with their modern UDIDs. If you have devices running iOS 15 or earlier, Apple doesn’t support them with <code class="language-plaintext highlighter-rouge">devicectl</code>, so you’ll have to use one of the other methods.</p>
  </li>
  <li><strong>mlaunch</strong>: Buried so deeply in the .NET iOS workloads that you’ll need an oxygen mask to find it, is another wonderful tool called <code class="language-plaintext highlighter-rouge">mlaunch</code>. You can use it to list connected devices and their UDIDs with this command:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> mlaunch <span class="nt">--listdev</span>
</code></pre></div>    </div>
    <p>Output will look something like this:</p>
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> Precious XXIII: 00008030-000E409C0E10802E
 Precious XXIV: 00008103-001829AE14BB001E
 Precious XXVII: 00008140-000A0DC83013C01C
 Precious XXVIII: 00008150-001C2C3E0EFB801C
 Precious XVIV: c52a6fd19cc179aad6696abe67cce53705bf22d0
</code></pre></div>    </div>
    <p>(along with a bunch of errors). This works for all devices, old and new. Don’t make fun of the XVIV, Roman numerals are hard.</p>
  </li>
  <li><strong>ios-deploy</strong>: Thanks to a few saints masquerading as software developers, there is the <code class="language-plaintext highlighter-rouge">ios-deploy</code> tool available one agonizing <code class="language-plaintext highlighter-rouge">brew install ios-deploy</code> away. You can use it to list connected devices and their UDIDs with this command:
    <div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code> ios-deploy <span class="nt">--detect</span>
</code></pre></div>    </div>
    <p>Output will look something like this:</p>
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> [....] Found 00008103-001829AE14BB001E (J517AP, iPad Pro 3G (11"), iphoneos, arm64e, 26.0, 23A340) a.k.a. 'Precious XXIV'
 [....] Found c52a6fd19cc179aad6696abe67cce53705bf22d0 (D11AP, iPhone 7 Plus, iphoneos, arm64, 15.7, 19H12) a.k.a. 'Precious XVIV'
</code></pre></div>    </div>
    <p>This works for all devices, old and new <em>so long as they are available</em> for deployment. You see, the other tools report devices that are paired to the host machine, even if they are not available for deployment (e.g. locked, no trust relationship, etc.) but <code class="language-plaintext highlighter-rouge">ios-deploy</code> only reports devices that are actually available for deployment. This can be a blessing or a curse depending on your mood.</p>
  </li>
</ol>

<h2 id="2-mlaunch">2. <code class="language-plaintext highlighter-rouge">mlaunch</code></h2>

<p>Buried so deeply in the .NET iOS workloads that you’ll need a spatula and a crowbar to find it, is the wonder aptly named <code class="language-plaintext highlighter-rouge">mlaunch</code>. You can use it to deploy your app to a connected device in two steps: first install the app, then launch it.</p>

<p>There is no documentation for <code class="language-plaintext highlighter-rouge">mlaunch</code>, but if you run it with the <code class="language-plaintext highlighter-rouge">--help</code> flag, you can see the myriad of options available to you. We shall discuss the most mundane ones here.</p>

<p>To install your app, you’re first going to need to build it. You can do this with the following command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>dotnet build <span class="nt">-f</span> net9.0-ios <span class="nt">-r</span> ios-arm64 MyApp.csproj
</code></pre></div></div>

<p>Now you can install the app with this command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mlaunch <span class="nt">--installdev</span><span class="o">=</span>bin/Debug/net9.0-ios/ios-arm64/MyApp.app <span class="nt">--devname</span><span class="o">=</span>MY_SPECIFIC_UDID
</code></pre></div></div>

<p>The location of mlaunch is curious, spurious, and mysterious. You can find it where all the great workloads are hidden, I mean stored. Somewhere beneath <code class="language-plaintext highlighter-rouge">/usr/local/share/dotnet/packs/Microsoft.iOS.Sdk.*</code></p>

<p>For example, on my machine it is located at: <code class="language-plaintext highlighter-rouge">/usr/local/share/dotnet/packs/Microsoft.iOS.Sdk.net10.0_18.5/18.5.10727-net10-rc.1/tools/lib/mlaunch/mlaunch.app/Contents/MacOS/mlaunch</code></p>

<p>Now that your app is installed, you can launch it with this command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mlaunch <span class="nt">--launchdevbundleid</span><span class="o">=</span>MY_BUNDLE_ID <span class="nt">--devname</span><span class="o">=</span>MY_SPECIFIC_UDID
</code></pre></div></div>

<p>Replace <code class="language-plaintext highlighter-rouge">MY_BUNDLE_ID</code> with your app’s bundle identifier (e.g. <code class="language-plaintext highlighter-rouge">com.mycompany.myapp</code>) and <code class="language-plaintext highlighter-rouge">MY_SPECIFIC_UDID</code> with your device’s UDID.</p>

<p>That’s it, you’re now a pro <code class="language-plaintext highlighter-rouge">mlaunch</code> user!</p>

<h2 id="3-xcrun">3. <code class="language-plaintext highlighter-rouge">xcrun</code></h2>

<p>If you are not in the mood to dig through the .NET SDK installation to find <code class="language-plaintext highlighter-rouge">mlaunch</code>, you can use the <code class="language-plaintext highlighter-rouge">xcrun</code> command-line tool that comes with Xcode. It has a subcommand called <code class="language-plaintext highlighter-rouge">devicectl</code> that you can use to install and launch your app on a connected device.</p>

<p>Build your app, and then run this command to install it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xcrun devicectl device <span class="nb">install </span>app <span class="nt">--device</span> MY_SPECIFIC_UDID bin/Debug/net9.0-ios/ios-arm64/MyApp.app
</code></pre></div></div>

<p>Then run this command to launch it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>xcrun devicectl device process launch <span class="nt">--terminate-existing</span> <span class="nt">--console</span> <span class="nt">--device</span> MY_SPECIFIC_UDID MY_BUNDLE_ID
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">--terminate-existing</code> flag will kill any existing instance of your app before launching it, and the <code class="language-plaintext highlighter-rouge">--console</code> flag will stream the app’s console output to your terminal. Hot.</p>

<p>Sadly, <code class="language-plaintext highlighter-rouge">xcrun devicectl</code> does not support older devices with long-in-the-tooth iOS versions, so if you have an iOS 15 or older device, you should go spelunking for <code class="language-plaintext highlighter-rouge">mlaunch</code> instead.</p>

<p>You’ve made it this far, you are truly a command line iOS hacker dev. But there’s one more way to deploy your app, and it’s the most hacky way of them all.</p>

<h2 id="4-ios-deploy">4. <code class="language-plaintext highlighter-rouge">ios-deploy</code></h2>

<p>If you aren’t satisfied with first- or third‑party tools, and want to dabble with “fourth‑party” greatness, then you can use the <code class="language-plaintext highlighter-rouge">ios-deploy</code> command-line tool. This tool is not officially supported by Apple or Microsoft, but it is widely used by the iOS development community.</p>

<p>First, make sure you have <code class="language-plaintext highlighter-rouge">ios-deploy</code> installed. If you haven’t done this yet, you can install it using Homebrew:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew <span class="nb">install </span>ios-deploy
</code></pre></div></div>

<p>Make sure you have a full battery, a gigabit internet connection, and a giant cup of coffee, because this may take a while. <code class="language-plaintext highlighter-rouge">ios-deploy</code> is small and quick to install, but Homebrew has to download 1/2 of the internet and at least 3 different versions of Python before it will let you have your machine back.</p>

<p>Once you have <code class="language-plaintext highlighter-rouge">ios-deploy</code> installed, you can use it to deploy and run your app. Build your app, then run:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ios-deploy <span class="nt">--justlaunch</span> <span class="nt">--debug</span> <span class="nt">-i</span> MY_SPECIFIC_UDID <span class="nt">-b</span> bin/Debug/net9.0-ios/ios-arm64/MyApp.app
</code></pre></div></div>

<p>And that’s it! You’ve now deployed your app using <code class="language-plaintext highlighter-rouge">ios-deploy</code>. Check out the <a href="https://github.com/ios-control/ios-deploy">ios-deploy README on GitHub</a> for more information and advanced usage.</p>

<h2 id="conclusion">Conclusion</h2>

<p>There you have it, four different ways to deploy your iOS app to a physical device for testing. No IDE needed. My personal favorite is <code class="language-plaintext highlighter-rouge">dotnet build -t:Run</code> because it’s the simplest and most reliable method. But knowing the alternatives can be useful in certain situations, when showing off to your mom, looking cool at parties, or when you just want to feel like a true command‑line iOS hacker dev.</p>]]></content><author><name></name></author><category term="article" /><summary type="html"><![CDATA[TL;DR There are no less than 4 different ways you can deploy your iOS app to a physical device for testing. I enumerate all of them below, along with their pros and cons. In the end, I recommend using dotnet build -t:Run if you can, as it is the simplest and most reliable method. But knowing the alternatives can be useful in certain situations.]]></summary></entry><entry><title type="html">Updating App Icons for iOS and macOS 26</title><link href="https://praeclarum.org/2025/09/12/app-icons.html" rel="alternate" type="text/html" title="Updating App Icons for iOS and macOS 26" /><published>2025-09-12T00:00:00+00:00</published><updated>2025-09-12T00:00:00+00:00</updated><id>https://praeclarum.org/2025/09/12/app-icons</id><content type="html" xml:base="https://praeclarum.org/2025/09/12/app-icons.html"><![CDATA[<p><strong>TL;DR</strong> Apple has introduced a new layered approach to app icons in iOS and macOS 26.
This post explains how to update your .NET and MAUI app icons to comply with the new requirements and ensure your app looks great on all devices. I outline two different approaches from easiest (do nothing) to hardest (composing and compiling your icons). I also outline the steps needed to manually compile your icons using the <code class="language-plaintext highlighter-rouge">actool</code> command line tool.</p>

<h2 id="the-new-icon-requirements">The New Icon Requirements</h2>

<p>With the release of iOS and macOS 26, Apple has introduced a new way to handle app icons. Instead of using a single static image, developers can now create layered icons that adapt to different contexts and styles.</p>

<p>All icons are now rounded rectangles with a required background layer and then up to <strong>four</strong> additional layers on top. These layers can each have their own material properties such as specularity (shininess), transparency (along with frosting), and drop shadows. Each layer is treated as being physically above the previous layer, creating a variety of 3D effects.</p>

<p>The layers are composited by the OS to create the final icon, which dynamically adapts to the environment, and, more importantly, user settings.</p>

<p>So how do you take advantage of this new feature in your .NET and MAUI apps? I’ll give you 3 options from easiest to hardest.</p>

<h2 id="option-1-do-nothing">Option 1: Do Nothing</h2>

<p>The easiest option is to do nothing. If you don’t update your app icons, Apple will analyze your icon and automatically create a layered version for you. This is the easiest option, but it may not give you the best results.</p>

<h3 id="simple-icons">Simple Icons</h3>

<p>This approach works well if your icon is a typical two-layer icon with a background and some distinct foreground imagery. In this case, Apple will attempt to separate the background and foreground layers and apply some default material properties to each layer.</p>

<p>For example, here is a simple icon with a blue background and white foreground and a little drop shadow. To its right is the automatically generated layered icon.</p>

<p><img src="/images/2025/app_icons_simple_compare.png" alt="Simple Icon Compare" /></p>

<p>Note that the corners are rounded and the system applied
specular highlights to the border. The drop shadow and
the gradient background were preserved.</p>

<p>If your icon is simple like this, you can probably get away with doing nothing. You’ll miss out on the 3D layering effects, but your icon will still look good and will fit in with the OS.</p>

<p>What if your icon is more complex?</p>

<h3 id="complex-icons">Complex Icons</h3>

<p>Complex icons, those that use transparency to create a non-rectangular shape, do not fare as well with the automatic approach. The system will add a background layer of its own choosing (some shade of gray) and then place your icon on top of it. This immediately distinguishes your app as older and not taking advantage of the new icon system.</p>

<p>Here is an example of a more complex icon with transparency and multiple colors.</p>

<p><img src="/images/2025/app_icons_complex_compare.png" alt="Complex Icon Compare" /></p>

<p>In this case, the system added a gray background layer and then squeezed the original icon on top of it. The result is a bit jarring and does not fit in with the new icon style.</p>

<p>If your icon is complex like this, then you should consider fully redesigning your icon using the new layered approach and Icon Composer app.</p>

<h2 id="option-2-use-icon-composer">Option 2: Use Icon Composer</h2>

<p>The Icon Composer app is a new tool included with Xcode 26 that allows you to create layered app icons. You can use this tool to design your icon with a background and up to four foreground layers, each with its own material properties.</p>

<p>To use it, you will want to start with your original icon and decompose it into layers. Ignore the background layer as that will be configured in the Icon Composer app. Then, create up to four additional layers for the foreground elements of your icon.</p>

<p>These layers should be simple SVG or PNG files <em>without</em>
drop shadows, transparency, or too much highlighting since those attributes can be configured in the Icon Composer app itself.</p>

<p>Let’s see it in action. First, I extracted the foreground layer and exported it as a PNG file. I then created a new icon in Icon Composer with a background gradient and this simple foreground layer.</p>

<p><img src="/images/2025/app_icons_layers_compare.png" alt="Icon Composer" /></p>

<p>On the left is my single layer (with transparency)
and on the right is the rendered icon (on macOS). There is the background gradient shaded a bit differently than before (I used the same colors from the other examples, the difference is due to the system’s rendering lighting model). The foreground layer is now sitting on top of the background with a <strong>very</strong> light drop shadow.</p>

<p>Most striking is how the foreground now gets rendered with its own specular highlights and shading.
It looks quite rounded now. These are just the defaults,
within Icon Composer you can adjust the material properties of each layer to get the look you want.</p>

<p>Here is a screenshot of the Icon Composer app itself so you can see the layers and properties.</p>

<p><img src="/images/2025/app_icons_composer.png" alt="Icon Composer App" /></p>

<p>There are a lot of options and even more environments (dark, light, mac, iOS, tinted, etc.) to preview your icon in.</p>

<h2 id="manually-compiling-your-icons">Manually Compiling Your Icons</h2>

<p>As of this writing (September 12, 2025), the .NET build system does not handle the new layered icon format.</p>

<p>However, you can manually compile your icons using the <code class="language-plaintext highlighter-rouge">actool</code> command line tool that comes with Xcode. This tool will take your <code class="language-plaintext highlighter-rouge">.icon</code> file created with Icon Composer and produce (1) a compiled Assets.car file that contains the layered icon for macOS 26 and iOS 26, and (2) a backwards compatible <code class="language-plaintext highlighter-rouge">.icns</code> file for macOS that will be used on older versions of macOS. This tool needs to be run separately for iOS and macOS.</p>

<p>This process is quite hacky, so I’d recommend waiting until the .NET build system supports this natively. But if you want to try it out, here are the steps.</p>

<h3 id="ios">iOS</h3>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Compile for iOS</span>
/Applications/Xcode.app/Contents/Developer/usr/bin/actool <span class="se">\</span>
    MyApp/MyIcon.icon <span class="nt">--app-icon</span> MyIcon <span class="se">\</span>
    <span class="nt">--compile</span> <span class="nb">.</span> <span class="se">\</span>
    <span class="nt">--output-partial-info-plist</span> assetcatalog_generated_info.plist <span class="se">\</span>
    <span class="nt">--target-device</span> iphone <span class="nt">--target-device</span> ipad <span class="nt">--minimum-deployment-target</span> 13.0 <span class="nt">--platform</span> iphoneos <span class="se">\</span>
</code></pre></div></div>

<p>This will produce 4 files:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">MyIcon60x60@2x.png</code> - backwards compatible image for iPhones running iOS 13.0 and later</li>
  <li><code class="language-plaintext highlighter-rouge">MyIcon76x76@2x~ipad.png</code> - backwards compatible image for iPads running iOS 13.0 and later</li>
  <li><code class="language-plaintext highlighter-rouge">Assets.car</code> - the archived assets that contain the new fully layered icon for iOS 26 and beyond. This is the file you need to include in your app bundle as a resource for the new icon to be used.</li>
  <li><code class="language-plaintext highlighter-rouge">assetcatalog_generated_info.plist</code> - a plist file whose entries you should add to your app’s Info.plist file.</li>
</ol>

<p>If your app is using other Assets, then you should augment the command line to include the <code class="language-plaintext highlighter-rouge">.xcassets</code> directory. For example:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Compile for iOS with existing assets</span>
/Applications/Xcode-26-RC.app/Contents/Developer/usr/bin/actool <span class="se">\</span>
    MyApp/Assets.xcassets <span class="se">\</span>
    MyApp/MyIcon.icon <span class="nt">--app-icon</span> MyIcon <span class="se">\</span>
    <span class="nt">--compile</span> <span class="nb">.</span> <span class="se">\</span>
    <span class="nt">--output-partial-info-plist</span> assetcatalog_generated_info.plist <span class="se">\</span>
    <span class="nt">--target-device</span> iphone <span class="nt">--target-device</span> ipad <span class="nt">--minimum-deployment-target</span> 13.0 <span class="nt">--platform</span> iphoneos <span class="se">\</span>
</code></pre></div></div>

<h3 id="macos">macOS</h3>

<p>Supporting macOS is similar, but you only need to support one target device (mac) and you will get a backwards compatible <code class="language-plaintext highlighter-rouge">.icns</code> file instead of multiple PNG files.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Compile for macOS</span>
/Applications/Xcode.app/Contents/Developer/usr/bin/actool <span class="se">\</span>
    MyApp/MyIcon.icon <span class="nt">--app-icon</span> MyIcon <span class="se">\</span>
    <span class="nt">--compile</span> <span class="nb">.</span> <span class="se">\</span>
    <span class="nt">--output-partial-info-plist</span> assetcatalog_generated_info.plist <span class="se">\</span>
    <span class="nt">--minimum-deployment-target</span> 11.0 <span class="nt">--platform</span> macosx <span class="nt">--target-device</span> mac
</code></pre></div></div>

<p>This will produce 3 files:</p>

<ol>
  <li><code class="language-plaintext highlighter-rouge">MyIcon.icns</code> - backwards compatible icon for macOS 11.0 and later</li>
  <li><code class="language-plaintext highlighter-rouge">Assets.car</code> - the archived assets that contain the new fully layered icon for macOS 26. This is the file you need to include in your app bundle for the new icon to be used.</li>
  <li><code class="language-plaintext highlighter-rouge">assetcatalog_generated_info.plist</code> - a plist file that you can ignore (just contains the icon name).</li>
</ol>

<h3 id="including-in-your-app">Including in Your App</h3>

<p>Once you have the compiled <code class="language-plaintext highlighter-rouge">Assets.car</code> file for iOS and/or macOS, you need to include it in your app bundle as a resource (along with the <code class="language-plaintext highlighter-rouge">.png</code> and <code class="language-plaintext highlighter-rouge">.icns</code> files).</p>

<p>In your <code class="language-plaintext highlighter-rouge">.csproj</code> file, you can include the files like this:</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;ItemGroup&gt;</span>
  <span class="nt">&lt;BundleResource</span> <span class="na">Include=</span><span class="s">"Assets.car"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;BundleResource</span> <span class="na">Include=</span><span class="s">"MyIcon.icns"</span> <span class="na">Condition=</span><span class="s">"'$(TargetFramework)' == 'net9.0-macos'"</span> <span class="nt">/&gt;</span>
  <span class="nt">&lt;BundleResource</span> <span class="na">Include=</span><span class="s">"MyIcon.png"</span> <span class="na">Condition=</span><span class="s">"'$(TargetFramework)' == 'net9.0-ios'"</span> <span class="nt">/&gt;</span>
<span class="nt">&lt;/ItemGroup&gt;</span>
</code></pre></div></div>

<p>You also need to update your <code class="language-plaintext highlighter-rouge">Info.plist</code> file adding whatever entries were generated in the <code class="language-plaintext highlighter-rouge">assetcatalog_generated_info.plist</code> file.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Apple’s new layered app icon system in iOS and macOS 26 offers a great opportunity to make your app stand out with dynamic and visually appealing icons. You can choose to do nothing and let the system generate a layered icon for you, or you can use the Icon Composer app to create a custom layered icon. For the adventurous, you can manually compile your icons using the <code class="language-plaintext highlighter-rouge">actool</code> command line tool. Whichever approach you choose, updating your app icons will help ensure your app looks great on all devices and fits in with the latest design trends.</p>]]></content><author><name></name></author><category term="article" /><summary type="html"><![CDATA[TL;DR Apple has introduced a new layered approach to app icons in iOS and macOS 26. This post explains how to update your .NET and MAUI app icons to comply with the new requirements and ensure your app looks great on all devices. I outline two different approaches from easiest (do nothing) to hardest (composing and compiling your icons). I also outline the steps needed to manually compile your icons using the actool command line tool.]]></summary></entry><entry><title type="html">Building and Deploying iOS MAUI Apps with GitHub Actions</title><link href="https://praeclarum.org/2025/08/19/maui-cicd.html" rel="alternate" type="text/html" title="Building and Deploying iOS MAUI Apps with GitHub Actions" /><published>2025-08-19T00:00:00+00:00</published><updated>2025-08-19T00:00:00+00:00</updated><id>https://praeclarum.org/2025/08/19/maui-cicd</id><content type="html" xml:base="https://praeclarum.org/2025/08/19/maui-cicd.html"><![CDATA[<p><strong>TL;DR</strong> Here are the basic steps to set up CI/CD for your iOS MAUI apps using GitHub Actions! I outline every step needed to setup the build environment, import your certificates and provisioning profiles, build, and, finally, upload your iOS apps to Test Flight and the App Store.
I leave the macOS, Android, and Windows versions as an exercise for the reader.</p>

<h2 id="the-challenge-of-maui-cicd">The Challenge of MAUI CI/CD</h2>

<p>MAUI (Multi-platform App UI) is a powerful framework for building cross-platform applications, but setting up Continuous Integration and Continuous Deployment (CI/CD) can be a bit tricky, especially for iOS apps. The process involves setting up the build server to have all the right Xcode and .NET versions,
actually building the app, signing it, and then distributing it to Test Flight and the App Store.</p>

<p>This blog post is a lot longer than I would like it to be, but the good news is that once you have a good CI/CD script running, it’s pretty stable and you can reuse it for all your MAUI apps. So, let’s dive in!</p>

<h2 id="setting-up-the-job">Setting Up the Job</h2>

<p>First, you need to setup the job to have the correct versions of Xcode and .NET. This is crucial for building your MAUI app correctly. Let’s start with locking down the macOS version and the Xcode version. You can use the <code class="language-plaintext highlighter-rouge">maxim-lobanov/setup-xcode</code> action to specify the Xcode version you want to use.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">jobs</span><span class="pi">:</span>
  <span class="na">build</span><span class="pi">:</span>
    <span class="na">name</span><span class="pi">:</span> <span class="s">Build iOS</span>
    <span class="na">runs-on</span><span class="pi">:</span> <span class="s">macos-15</span>
    <span class="na">timeout-minutes</span><span class="pi">:</span> <span class="m">45</span>

    <span class="na">env</span><span class="pi">:</span>
      <span class="na">DOTNET_CLI_TELEMETRY_OPTOUT</span><span class="pi">:</span> <span class="m">1</span>
      <span class="na">DOTNET_VERSION</span><span class="pi">:</span> <span class="s2">"</span><span class="s">net9.0"</span>

    <span class="na">steps</span><span class="pi">:</span>

    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Checkout Code</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v4</span>
      <span class="na">with</span><span class="pi">:</span>
        <span class="na">submodules</span><span class="pi">:</span> <span class="no">true</span>

    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Set Xcode</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">maxim-lobanov/setup-xcode@v1</span>
      <span class="na">with</span><span class="pi">:</span>
        <span class="na">xcode-version</span><span class="pi">:</span> <span class="s2">"</span><span class="s">16.3"</span>
</code></pre></div></div>

<p>Here, I have specified macOS 15 and Xcode 16.3. You can adjust these versions based on your requirements.</p>

<p>I have also set a timeout of 45 minutes for the job, because macOS
jobs are <strong>very</strong> slow and are also <strong>very</strong> expensive to run. You want to prevent runaway jobs from costing you a fortune.</p>

<p>I set the <code class="language-plaintext highlighter-rouge">DOTNET_CLI_TELEMETRY_OPTOUT</code> environment variable to <code class="language-plaintext highlighter-rouge">1</code> to disable telemetry, which is a good practice for CI/CD environments.</p>

<p>Lastly, I set the <code class="language-plaintext highlighter-rouge">DOTNET_VERSION</code> environment variable to <code class="language-plaintext highlighter-rouge">net9.0</code> because it is repeated throughout these steps (and in build paths) and
I like to minimize the things I need to change when updating .NET versions.</p>

<p>Now it’s time to install the .NET SDK. This is a two-step process:</p>

<ol>
  <li>Install the .NET SDK using the <code class="language-plaintext highlighter-rouge">actions/setup-dotnet</code> action.</li>
  <li>Install the workloads needed for your MAUI app.</li>
</ol>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Setup .NET</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/setup-dotnet@v4</span>
      <span class="na">with</span><span class="pi">:</span>
        <span class="na">global-json-file</span><span class="pi">:</span> <span class="s">global.json</span>

    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Install Workloads</span>
      <span class="na">run</span><span class="pi">:</span> <span class="s">dotnet workload restore MyApp.sln</span>
</code></pre></div></div>

<p>Here, I reference the <code class="language-plaintext highlighter-rouge">global.json</code> file to ensure the correct .NET SDK version is used. If you do not use a <code class="language-plaintext highlighter-rouge">global.json</code> file (why aren’t you?), specify the .NET SDK version directly in the <code class="language-plaintext highlighter-rouge">dotnet-version</code> input of the <code class="language-plaintext highlighter-rouge">actions/setup-dotnet</code> action.</p>

<h2 id="install-certificates-and-provisioning-profiles">Install Certificates and Provisioning Profiles</h2>

<p>To build and sign your iOS app, you need to install the necessary certificates and provisioning profiles. This is a crucial step for iOS apps, as they require proper signing to run on devices and be distributed via Test Flight or the App Store.</p>

<p>The trick is to store your certificate as a GitHub secret. You’ll then restore that certificate to the keychain and then you’ll be able to automatically download the provisioning profile from Apple.</p>

<p>I use the <code class="language-plaintext highlighter-rouge">apple-actions/import-codesign-certs</code> action to import the certificate. It requires 2 things:</p>

<ol>
  <li>The base64-encoded P12 file of your certificate. I store this in a GitHub secret named <code class="language-plaintext highlighter-rouge">APPSTORE_CERTIFICATE_P12</code>.</li>
  <li>The password for the P12 file. I store this in a GitHub secret named <code class="language-plaintext highlighter-rouge">APPSTORE_CERTIFICATE_P12_PASSWORD</code>.</li>
</ol>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Import Apple Certificate</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">apple-actions/import-codesign-certs@v4</span>
      <span class="na">with</span><span class="pi">:</span>
        <span class="na">create-keychain</span><span class="pi">:</span> <span class="no">true</span>
        <span class="na">keychain-password</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_CERTIFICATE_P12_PASSWORD }}</span>
        <span class="na">p12-file-base64</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_CERTIFICATE_P12 }}</span>
        <span class="na">p12-password</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_CERTIFICATE_P12_PASSWORD }}</span>
</code></pre></div></div>

<p>I use the same password for the keychain and the P12 file because I’m often using the same job to build macOS, iOS, and Catalyst apps and sharing the same keychain is convenient.</p>

<p>To generate the P12 file, you can use the <code class="language-plaintext highlighter-rouge">Keychain Access</code> app on your Mac. Export your certificate as a P12 file and save it somewhere with a password. Then, encode it to base64, which you can do with the following command:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">base64</span> <span class="nt">-i</span> <span class="s1">'MyAppleCertificate.p12'</span> | pbcopy
</code></pre></div></div>

<p>That will copy the base64-encoded string to your clipboard, which you can then paste into your GitHub secret. Make sure to also set the password for the P12 file as a GitHub secret.</p>

<p>Now we need to download the provisioning profile from Apple. This is done using the <code class="language-plaintext highlighter-rouge">apple-actions/download-provisioning-profiles</code> action. You need to provide the App ID and the Team ID for your Apple Developer account.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Download Provisioning Profile</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">apple-actions/download-provisioning-profiles@v4</span>
      <span class="na">with</span><span class="pi">:</span> 
        <span class="na">bundle-id</span><span class="pi">:</span> <span class="s1">'</span><span class="s">com.example.myapp'</span>
        <span class="na">profile-type</span><span class="pi">:</span> <span class="s1">'</span><span class="s">IOS_APP_STORE'</span>
        <span class="na">issuer-id</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_ISSUER_ID }}</span>
        <span class="na">api-key-id</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_KEY_ID }}</span>
        <span class="na">api-private-key</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_PRIVATE_KEY }}</span>
</code></pre></div></div>

<p>(Set the <code class="language-plaintext highlighter-rouge">bundle-id</code> to match your app’s bundle identifier.)</p>

<p>The <code class="language-plaintext highlighter-rouge">APPSTORE_ISSUER_ID</code>, <code class="language-plaintext highlighter-rouge">APPSTORE_KEY_ID</code>, and <code class="language-plaintext highlighter-rouge">APPSTORE_PRIVATE_KEY</code> are also GitHub secrets that you need to set up. You can generate these from your Apple Developer account.</p>

<p>Go to <a href="https://appstoreconnect.apple.com/access/integrations/api">https://appstoreconnect.apple.com/access/integrations/api</a> to create an API key that you will use for GitHub Actions. This will give you the <code class="language-plaintext highlighter-rouge">APPSTORE_ISSUER_ID</code>, <code class="language-plaintext highlighter-rouge">APPSTORE_KEY_ID</code>, and the private key that you need to store as a GitHub secret.</p>

<p>Once all those secrets are set up, you can run the job and it will import the certificate and download the provisioning profile automatically.</p>

<h2 id="build-the-ios-app">Build the iOS App</h2>

<p>This is the easy part. Take a stretch. Have some coffee. You’ve earned it.</p>

<p>All you need to do is run the <code class="language-plaintext highlighter-rouge">dotnet publish</code> command and pass it the <strong>project file</strong> of your MAUI app. Do not pass the solution as the build process is designed to work with the project file directly and will otherwise try to publish every project in the solution.</p>

<p>There are a few important flags to pass to the <code class="language-plaintext highlighter-rouge">dotnet publish</code> command:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">-c Release</code>: This specifies that you want to build the app in Release mode.</li>
  <li><code class="language-plaintext highlighter-rouge">-f $-ios</code>: This specifies the target framework for iOS. The <code class="language-plaintext highlighter-rouge">env.DOTNET_VERSION</code> variable is set to <code class="language-plaintext highlighter-rouge">net9.0</code> in the <code class="language-plaintext highlighter-rouge">env</code> section of the job, so it will resolve to <code class="language-plaintext highlighter-rouge">net9.0-ios</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">-p:ArchiveOnBuild=true</code>: This tells the build process to create an archive of the app, which is necessary for distribution.</li>
  <li><code class="language-plaintext highlighter-rouge">-p:RuntimeIdentifier=ios-arm64</code>: This specifies the runtime identifier for iOS.</li>
  <li><code class="language-plaintext highlighter-rouge">"/p:CodesignKey=\"Apple Distribution: My Awesome Company, Inc. (XXX12AB34C)\""</code>: This specifies the code signing key to use for signing the app. Replace this with your own code signing key name that you can see during the key import step. Escaping the quotes is necessary to ensure the command is parsed correctly and is a little insanity making, but it works.</li>
</ul>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Build</span>
      <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
        <span class="s">dotnet publish -c Release -f ${{env.DOTNET_VERSION}}-ios -p:ArchiveOnBuild=true -p:RuntimeIdentifier=ios-arm64 "/p:CodesignKey=\"Apple Distribution: My Awesome Company, Inc. (XXX12AB34C)\"" MyApp/MyApp.csproj</span>
</code></pre></div></div>

<p>For details on all the wonderful options you can pass to the <code class="language-plaintext highlighter-rouge">dotnet publish</code> command, see the official documentation: <a href="https://learn.microsoft.com/en-us/dotnet/maui/ios/deployment/publish-cli?view=net-maui-9.0&amp;source=recommendations">Publish an iOS app using the command line</a>.</p>

<p>Assuming you were a good developer, paid your taxes, and pass the karma test, this will build your app and create an <code class="language-plaintext highlighter-rouge">.ipa</code> file in the <code class="language-plaintext highlighter-rouge">bin/Release/$-ios/ios-arm64/publish</code> directory of your MAUI project.</p>

<h2 id="upload-that-puppy-to-test-flight">Upload that Puppy to Test Flight</h2>

<p>Finally, we need to upload the built <code class="language-plaintext highlighter-rouge">.ipa</code> file to Test Flight. This is done using the <code class="language-plaintext highlighter-rouge">apple-actions/upload-testflight-build</code> action.</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code>    <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Upload to TestFlight</span>
      <span class="na">uses</span><span class="pi">:</span> <span class="s">apple-actions/upload-testflight-build@v1</span>
      <span class="na">with</span><span class="pi">:</span>
        <span class="na">app-type</span><span class="pi">:</span> <span class="s">ios</span>
        <span class="na">app-path</span><span class="pi">:</span> <span class="s1">'</span><span class="s">MyApp/bin/Release/${{env.DOTNET_VERSION}}-ios/ios-arm64/publish/MyApp.ipa'</span>
        <span class="na">issuer-id</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_ISSUER_ID }}</span>
        <span class="na">api-key-id</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_KEY_ID }}</span>
        <span class="na">api-private-key</span><span class="pi">:</span> <span class="s">${{ secrets.APPSTORE_PRIVATE_KEY }}</span>
</code></pre></div></div>

<p>You were worried we were going to have to create more secrets weren’t you? Don’t lie. I know you were. Good news! You can use the same <code class="language-plaintext highlighter-rouge">APPSTORE_ISSUER_ID</code>, <code class="language-plaintext highlighter-rouge">APPSTORE_KEY_ID</code>, and <code class="language-plaintext highlighter-rouge">APPSTORE_PRIVATE_KEY</code> secrets that you used to download the provisioning profile.</p>

<p>This action will upload the <code class="language-plaintext highlighter-rouge">.ipa</code> file to Test Flight, where you can then test your app before releasing it to the App Store.</p>

<h2 id="conclusion">Conclusion</h2>

<p>You are now ready to build and deploy your MAUI apps using GitHub Actions! Pat yourself on the back, that wasn’t easy. But the good news is that once you have this setup, you can reuse it for all your MAUI apps. Just make sure to adjust the bundle identifier and the project file path in the <code class="language-plaintext highlighter-rouge">dotnet publish</code> command.</p>

<p>Now go! Continuously integrate. Deploy continuously. And may your MAUI apps be bug-free and loved by users everywhere!</p>]]></content><author><name></name></author><category term="article" /><summary type="html"><![CDATA[TL;DR Here are the basic steps to set up CI/CD for your iOS MAUI apps using GitHub Actions! I outline every step needed to setup the build environment, import your certificates and provisioning profiles, build, and, finally, upload your iOS apps to Test Flight and the App Store. I leave the macOS, Android, and Windows versions as an exercise for the reader.]]></summary></entry><entry><title type="html">I Built the World’s Largest Translated Cuneiform Corpus using AI</title><link href="https://praeclarum.org/2023/06/09/cuneiform.html" rel="alternate" type="text/html" title="I Built the World’s Largest Translated Cuneiform Corpus using AI" /><published>2023-06-09T00:00:00+00:00</published><updated>2023-06-09T00:00:00+00:00</updated><id>https://praeclarum.org/2023/06/09/cuneiform</id><content type="html" xml:base="https://praeclarum.org/2023/06/09/cuneiform.html"><![CDATA[<p><strong>TL;DR</strong> I used a custom-trained Large Language Model (T5) to
create the world’s largest online corpus of translated cuneiform
texts. It’s called the <a href="https://aicuneiform.com">AICC (AI Cuneiform Corpus)</a> and contains
130,000 AI translated texts from the <a href="https://cdli.ucla.edu/">CDLI</a>
and <a href="http://oracc.org/">ORACC</a> projects.</p>

<h2 id="cuneiform">Cuneiform</h2>

<p>Cuneiform is the oldest known writing system. It was used in
Mesopotamia (modern day Iraq) for over 3,000 years. It was used to
write Sumerian, Akkadian, and other languages. Written on clay,
it has survived the millennia and is now being translated by
scholars around the world.</p>

<p>Sadly, we have more clay tablets than scholars.
Fortunately, we have computers.</p>

<h2 id="introducing-the-aicc">Introducing the AICC</h2>

<p>I’m proud to introduce the <a href="https://aicuneiform.com">AICC</a> - a collection of 130,000
cuneiform texts translated from ancient Sumerian and Akkadian
to English using a neural network. It is the largest collection
of translated cuneiform texts in the world.</p>

<p>This is the 2nd edition of the translated corpus I released last summer.
The 1st edition contained about 30,000 texts but this new edition
boasts 130,000 texts. The corpus is growing fast!</p>

<p>How good are the translations? Well, they’re decent. :-) I hope you’ll
go browse the site and see for yourself.</p>

<p>Judging the quality of cuneiform translations has a rich history.
Indulge me in a story.</p>

<h3 id="can-it-translate-tiglath-pileser">Can it Translate Tiglath-Pileser?</h3>

<p>In 1857 a new cylinder inscribed with cuneiform text
and the name Tiglath-Pileser was found (dated 1150 BC).
At this time, cuneiform was just being relearned and
there was a question as to how good various translation methods were.</p>

<p>The Royal Asiatic Society decided to perform an experiment
(that was later published as the book
<a href="https://books.google.com/books/about/Inscription_of_Tiglath_Pileser_I_King_of.html?id=GGI-AAAAcAAJ">Inscription of Tiglath Pilser I., King of Assyria</a>).
They would give the same inscription to three different translators
and see how well they did. The idea was that if the translations
were similar, then the current understanding of cuneiform was also good.</p>

<p>Here is a page from the book showing the beginning of two of the three translations:</p>

<p><img src="/images/2023/tiglath-pileser.jpg" alt="Tiglath-Pileser Translations" /></p>

<p>We can see that although the translations are different,
they convey the same meaning. The experiment was a success.</p>

<p>I am pleased to now add my own (well, my AI’s) translation to the mix.
Just one problem, my corpus contains <a href="https://aicuneiform.com/search?q=tigl">many objects</a>
with similar inscriptions and I’m not sure which one is this specific Tiglath-Pileser cylinder.
There’s 
<a href="https://aicuneiform.com/p/p393.html#P393923">P393923</a>,
<a href="https://aicuneiform.com/p/p463.html#P463064">P463064</a> (newly translated),
<a href="https://aicuneiform.com/p/p463.html#P463510">P463510</a> (newly translated),
<a href="https://aicuneiform.com/p/p467.html#P467316">P467316</a> (newly translated),
<a href="https://aicuneiform.com/p/q005.html#Q005926">Q005926</a>,
<a href="https://aicuneiform.com/p/q006.html#Q006021">Q006021</a> and more.</p>

<p>I decided in fairness and in the spirit of the competition to translate
one that had no previous translation. I chose <a href="https://aicuneiform.com/p/p467.html#P467316">P467316</a>
as its beginning seemed to match the other translations.</p>

<h3 id="ai-translation-of-tiglath-pileser">AI Translation of Tiglath-Pileser</h3>

<div style="background-color:#fff;color:#000;padding:1em 4em;font-family:georiga serif;border-radius:0.25em">
<p style="text-align:center"><em>AI Translation.</em></p>
<p>Ashur, great lord, who makes the totality of the gods submit, who gives scepter and crown, who establishes kingship; Enlil, lord, king of all the Anunnaku gods, father of the gods, lord of the lands; Sin, wise one, lord of the crown, saqu-worthy; Magur, Shamash, judge of heaven and underworld, who carries out the slander of the enemy, who breaks up the foe; Adad, hero, conqueror of the four quarters of the lands, the four quarters; Ninurta, hero, villainous sacrificial spirit and enemy, who destroys the heart of the people; Ishtar, foremost among the gods, lady of battle;
</p>
<p>
The great gods, who make the heavens and earth a lordly place, whose utterances are a scepter and a scepter, who make kingship supreme, Tiglath-Pileser, beloved prince, your beloved, your shepherd, who by your true heart you have entrusted to me, this exalted one, you have established for the sovereignty of the land of the great Enlil, you have granted him a scepter.
</p>
</div>

<p>Continued in <a href="https://aicuneiform.com/p/p467.html#P467316">P467316</a>.</p>

<p>While stilted in places, it is a decent translation, and I deem this experiment a success!</p>

<h2 id="why-ai-translations">Why AI Translations?</h2>

<p>Existing online repositories (<a href="https://cdli.ucla.edu/">CDLI</a>, <a href="http://oracc.org/">Oracc</a>)
contain many <em>transliterations</em> of ancient cuneiform texts
(a transliteration is a rewriting of a text from one writing system
to another without changing the language), but they are very lacking
in the <em>translations</em> department.</p>

<p>While I am not a cuneiform expert, I am an expert at neural networks
and have a deep passion for languages and writing systems.
I want any person to have access to the archives of
the ancients.
A grandiose goal for sure, but also a very achievable one thanks to
modern engineering advancements.</p>

<h4 id="sumerian">Sumerian</h4>

<p>Consider Sumerian (spoken by the creators of cuneiform).
There are currently 103,075 texts published with
transliterations from cuneiform symbols to (mostly) latin letters.
But only 4,583 of these texts have publicly available translations online.
That is a mere 4% of texts available to a lay person such as myself.</p>

<table>
  <thead>
    <tr>
      <th>Publications</th>
      <th>Count</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Transliterated</td>
      <td>103,075</td>
    </tr>
    <tr>
      <td>Translated</td>
      <td>4,583</td>
    </tr>
    <tr>
      <td>Need Translations</td>
      <td><strong>98,492</strong></td>
    </tr>
  </tbody>
</table>

<p>Given the existing transliterations, there are 98,492 works
that can be translated but have not yet been.</p>

<p>(There are more translations than these, but the others are not freely
available and are held under copyright. In other words, you need to go 
by a book to read them.)</p>

<p>Things aren’t much better for Akkadian (the language spoken by the famous Sargon
and Ashurbanipal).</p>

<h4 id="akkadian">Akkadian</h4>

<table>
  <thead>
    <tr>
      <th>Publications</th>
      <th>Count</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Transliterated</td>
      <td>31,747</td>
    </tr>
    <tr>
      <td>Translated</td>
      <td>10,069</td>
    </tr>
    <tr>
      <td>Need Translations</td>
      <td><strong>21,678</strong></td>
    </tr>
  </tbody>
</table>

<p>We can see that 21,678 works are all set to be translated but have
not been.</p>

<h2 id="training-a-large-language-model">Training a Large Language Model</h2>

<p>The modern advancement of large language models (LLMs) has affected and
will continue to affect nearly every human endeavor.</p>

<p>The current architecture that is heralding this new age of knowledge is the Transformer architecture. It was designed specifically to be very good at translating text from one language to another using the innovative “attention mechanism”. It’s a little funny that this network designed for translation is now broaching the realm of artificial general intelligence (AGI), but I digress.</p>

<p>Ignoring the absurdly large LLMs that are dominating the field now (GPT-4 and friends), the humble smaller transformers are still quite powerful and have made the problem of translation a somewhat trivial.</p>

<p>My favorite one of these is the T5 network from Google. While large itself, it is capable of being trained using off-the-shelf (though expensive) GPUs. If you can build a large training set, you can 
train this network at home to accomplish wonders.</p>

<p>Knowing this I set about building a training
set that the network could use to learn 
these ancient languages.</p>

<h3 id="building-the-dataset">Building the Dataset</h3>

<p>Thankfully there has been a push to digitize acquired artifacts and to publish their cuneiform on the web.</p>

<p>The two great projects are the <a href="https://cdli.ucla.edu/">CDLI (Cuneiform Digital Library Initiative)</a> and <a href="http://oracc.org/">Oracc</a>. I owe a large debt
to these projects.</p>

<p>As any machine learning expert will tell you, 90% of the problem is collecting a good training dataset (the other 10% is justifying the compute bill). Building the cuneiform dataset presented its own unique set of challenges.</p>

<h4 id="inconsistent-transliterations">Inconsistent Transliterations</h4>

<p>Sadly, Assyriologists took some time to settle on a consistent transliteration system. When works were first transliterated to a digital form, only ASCII characters were available and the researchers made due using funny characters like # to denote demonstratives, numbers to disambiguate symbols, and ALL CAPS whenever they were in the mood (just kidding, but the use is so random it might as well be).</p>

<p>When other character encodings became available, researchers adapted. They started to use diacritic marks to disambiguate symbols (loosely based on guessed sounds). And then HTML was invented and they went wild with special marks attempting to better capture the original writing.</p>

<p>While neural networks are powerful and can certainly handle these inconsistencies, it’s not ideal. If you want the network to properly learn the language it’s best not to distract it with also learning the histrionics of human computer interface systems.</p>

<p>A wrote a variety of cuneiform and english normalizers to help with this problem. They’re not perfect, but they do a decent job.</p>

<h4 id="paragraph-wrapping-and-unwrapping">Paragraph Wrapping and Unwrapping</h4>

<p>Cuneiform texts are usually written line by line in a column and are read
from top to bottom.</p>

<p>These lines are often short and, when translated, contain even fewer words.
If I train the network on just these lines (and, surprise, I did for the 1st edition),
the translations it produces are also short and choppy. They’re not great.</p>

<p>To work around this problem, I automatically “unwrap” lines into paragraphs
to be translated all together. This way the network can learn to translate
longer sentences and paragraphs.</p>

<p>The network, however, has its own limitations and can only translate
sentences up to 512 tokens long. To work around this problem,
I “wrap” the paragraphs into chunks of up to 512 tokens and translate
those. I then stitch the translations back together to form the final
translation.</p>

<p>This “unwrap” then “wrap” process is not perfect and can lead to
some strange translations, but it’s better than the alternative
of just translating single lines.</p>

<h3 id="training-process">Training Process</h3>

<p>I started with a pre-trained a <a href="https://huggingface.co/t5-base">T5 base model from Hugging Face</a> and fine-tuned it on my dataset.
This model has 220 million parameters and is capable of translating 
sequences of up to 512 tokens.</p>

<p>I trained it on a dataset of 210,247 translation examples
for 30 epochs. It took about 48 hours on my RTX3090.</p>

<p>While starting with a pre-trained model saves me a lot of compute time,
it has drawbacks. The pre-trained model was trained to translate
from English to French or German. Ideally, I would have a model
that was pre-trained to translate <em>to</em> English.</p>

<p>Also, I used its default tokenizer which does not support all the
characters I need and performs poorly on the transliterated cuneiform.</p>

<h4 id="learning-sumerian-and-akkadian-simultaneously">Learning Sumerian and Akkadian Simultaneously</h4>

<p>Since my datasets are small in size, I decided to combine learning
Sumerian and Akkadian simultaneously. This has the benefit of
increasing the training size and exposing the network to more
cuneiform symbols. Interestingly, Akkadian often uses some Sumerian
intermixed with its own language so it’s not a bad idea to train
on both.</p>

<h4 id="bidirectional-translation">Bidirectional Translation</h4>

<p>The network was having a hard time converging on a good solution.
It would train well enough for many epochs, and then it would
fall apart.</p>

<p>I found a regularization strategy that helped a lot. I would
train it to also translate <em>from</em> English to Sumerian and Akkadian.
Doing this helped the network to always converge.
I assume this is an affect of using the pre-trained network.</p>

<p>While translating from English to Akkadian or Sumerian is not
a useful task, it is a “fun party trick” as my friend put it.</p>

<h2 id="future-work">Future Work</h2>

<p>I want to continue to improve the translations and hope to take these
steps in the future:</p>

<ol>
  <li>Fine-tune the model for specific translation tasks like Akkadian to English.</li>
  <li>Pre-train a new model from scratch using a better tokenizer.</li>
  <li>Train a larger model like T5 large.</li>
  <li>Add more training data.</li>
</ol>

<h2 id="conclusion">Conclusion</h2>

<p>I hope you enjoyed this deep dive into neural networks and ancient languages.</p>

<p>When I started this project, I had no idea whether it would work or not.
I was delighted that it did, and I am extremely delighted
to be able to introduce the <a href="https://aicuneiform.com">AICC</a> to the world. Now amateur Assyriologists
like myself can read and read to their heart’s content.</p>

<p><strong>Side note:</strong> If you are an academic and would like to collaborate on this project,
please reach out to me by <a href="https://github.com/praeclarum/CuneiformTranslators/issues/new">filing issues on GitHub</a>. I have a million questions about cuneiform
that I would love to ask you.</p>]]></content><author><name></name></author><category term="article" /><summary type="html"><![CDATA[TL;DR I used a custom-trained Large Language Model (T5) to create the world’s largest online corpus of translated cuneiform texts. It’s called the AICC (AI Cuneiform Corpus) and contains 130,000 AI translated texts from the CDLI and ORACC projects.]]></summary></entry><entry><title type="html">How I Re-implemented PyTorch for WebGPU</title><link href="https://praeclarum.org/2023/05/19/webgpu-torch.html" rel="alternate" type="text/html" title="How I Re-implemented PyTorch for WebGPU" /><published>2023-05-19T00:00:00+00:00</published><updated>2023-05-19T00:00:00+00:00</updated><id>https://praeclarum.org/2023/05/19/webgpu-torch</id><content type="html" xml:base="https://praeclarum.org/2023/05/19/webgpu-torch.html"><![CDATA[<p><strong>TL;DR</strong> I’ve been working on a WebGPU optimized inference and autograd library called <a href="https://github.com/praeclarum/webgpu-torch">webgpu-torch</a> with an API that matches PyTorch. The goal is to run neural networks in the browser at speeds comparable to a Linux workstation. Many kernels have been implemented and its design is easily extensible. It’s <a href="https://www.npmjs.com/package/webgpu-torch">available on NPM now</a> and works in both the browser and Node.js!</p>

<h2 id="neural-networks-in-the-browser">Neural Networks in the Browser</h2>

<p><a href="https://github.com/praeclarum/transformers-js">Nine months ago</a>, I got Hugging Face Transformers (Large Language Models like GPT but a wee bit smaller) working in the browsers thanks to the ONNX web runtime and some painfully hand-coded tokenizers.</p>

<p>It’s quite liberating running these nets in the browser since the web is the best software distribution platform ever created. You can just send someone a link and they can run your code. No need to install anything. No need to worry about what OS they’re running. No need to worry about what hardware they have. It’s all just there.</p>

<p>The only problem is that ONNX is a wee bit, shall we say, slow.</p>

<p>Thankfully, WebGPU has arrived in browsers and we can now properly access the GPU to write optimized kernels for neural network operations. This is a huge deal. It means we can now run neural networks in the browser at speeds comparable to NVIDIA/CUDA.</p>

<p>Someone just needs to, you know, do the hard work of implementing all those operations for the GPU.</p>

<p>Well that’s what I’m very pleased to announce I’ve been working on for the past few months. I’ve been re-implementing PyTorch in TypeScript for WebGPU.</p>

<h2 id="what-is-a-pytorch">What is a PyTorch?</h2>

<p>PyTorch is a wrapper over the torch runtime (which I first used with Lua) for performing neural network operations. It’s a very popular library for doing AI work and seems to have won the arms race for now.</p>

<p>The library is broken up into parts:</p>

<ol>
  <li>
    <p>An optimized (for GPU) math library supporting element-wise operations, matrix multiplication, convolutions, reductions, etc. over tensors.</p>
  </li>
  <li>
    <p>An automatic differentiation library (autograd) that is just a lot of bookkeeping to keep track of the operations performed on tensors so that gradients can be calculated.</p>
  </li>
  <li>
    <p>A neural network library that is just a bunch of layers that can be composed together to form a neural network.</p>
  </li>
</ol>

<p>Doesn’t sound so hard to re-implement right? And so I did.</p>

<h2 id="what-is-a-webgpu">What is a WebGPU?</h2>

<p>WebGPU is the new standard for accessing GPUs from the browser. It supports generic compute shaders and is designed to be a low level API that can be used to build higher level libraries. The compute shaders are able to break work up into a 3D grid and, so long as you can reformulate your code to take advantage of that 3D grid, you can benefit from dedicated hardware doing the computations.</p>

<p>This is perfect for the web since JavaScript is single-threaded and not optimized for doing heavy computation. The GPU is a perfect fit for this since it’s designed to do heavy computation in parallel.</p>

<h2 id="writing-optimized-webgpu-kernels">Writing Optimized WebGPU Kernels</h2>

<p>PyTorch is very mature now and supports a huge variety of operations. It’s also very well optimized for CUDA and CUDNN (NVIDIA’s compute libraries). So how do you go about re-implementing all of those for WebGPU?</p>

<p>Well, you start with the basics. You implement the basic operations like element-wise operations, matrix multiplication, convolutions, reductions, etc. But there is a tremendous amount of similarity between these operations.</p>

<p>For example, element-wise multiplication and addition only vary by the operator used in the inner loop. The trick is to optimize the memory layout and kernels of those operations so they are fast. They need to adapt to big and small GPUs and they need to adapt to big and small workloads.</p>

<p>This is a perfect scenario to take advantage of code generation. I wrote a code generator that takes a template and generates the optimized kernels for each operation. The code generator is written in TypeScript and generates WebGPU compute shader code. This means that the generated code can be heavily optimized for the given scenario and those optimizations can be shared between operations.</p>

<p>For example, here is how I define the <code class="language-plaintext highlighter-rouge">ReLU</code> operation (from <code class="language-plaintext highlighter-rouge">op_table.ts</code>):</p>

<div class="language-typescript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span>
    <span class="nl">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">relu</span><span class="dl">"</span><span class="p">,</span>
    <span class="nx">nnName</span><span class="p">:</span> <span class="dl">"</span><span class="s2">ReLU</span><span class="dl">"</span><span class="p">,</span>
    <span class="nx">nnOp</span><span class="p">:</span> <span class="kc">true</span><span class="p">,</span>
    <span class="kd">type</span><span class="p">:</span> <span class="dl">"</span><span class="s2">unary</span><span class="dl">"</span><span class="p">,</span>
    <span class="nx">forward</span><span class="p">:</span> <span class="dl">"</span><span class="s2">output = max(input, 0.0)</span><span class="dl">"</span><span class="p">,</span>
    <span class="nx">backward</span><span class="p">:</span> <span class="dl">"</span><span class="s2">inputGrad = input &gt; 0.0 ? outputGrad : 0.0</span><span class="dl">"</span><span class="p">,</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In this template I define both the forward computation <code class="language-plaintext highlighter-rouge">max(input, 0.0)</code> and the backward computation <code class="language-plaintext highlighter-rouge">input &gt; 0.0 ? outputGrad : 0.0</code>. The code generator then generates the optimized kernels for both the forward and backward passes based on the size of your GPU (the size of compute workgroups) and the shape of tensors (in addition to the memory layouts of the tensors).</p>

<p>Keeping the template short and simple gives me flexibility to optimize the kernels as needed while preserving the core logic. For example, different kernels can be emitted for contiguous memory tensors vs strided memory tensors. For operations like reductions, 1D, 2D, 3D, and xD kernels can be emitted to take advantage of the 3D workgroup grid.</p>

<p>At first I designed the template system to help me save some typing, but I quickly realized its power and now I use it for all operations.</p>

<h2 id="debugging-webgpu-kernels">Debugging WebGPU Kernels</h2>

<p>Another huge benefit came from the fact that I was generating the kernels. I could generate the kernels to not only emit WebGPU code, but also JavaScript code. The core logic gets wrapped in another function that can be called from JavaScript. This means that I can run the same code in JavaScript and WebGPU and compare the results. Even better, I can debug kernels in JavaScript and then execute them on WebGPU.</p>

<p>The JavaScript CPU kernels are terribly slow, but they’re not supposed to be fast. They instead provide a convenient playground for debugging and testing kernels.</p>

<p>This also means that my WebGPU library can also run just fine in Node.js, without WebGPU, whatever. Isn’t it great when architectural decisions keep paying off?</p>

<h2 id="testing">Testing</h2>

<p>The worst part of using a new neural network library is when it doesn’t give the exact same results as previous libraries you’ve used. One of my biggest frustrations with the WebGL ONNX backend is the fact that it gives very inaccurate results compared to PyTorch. I didn’t want that. I want full fidelity. I want to make sure all my WebGPU kernels match the results of PyTorch operations.</p>

<p>To that end, I have built a test harness that first runs code snippets in PyTorch to record results, then runs the same code snippets in my library and compares the results. If they don’t match, it throws an error.</p>

<p>This has produced a silly but fun web page to go visit. If you go to <a href="https://praeclarum.org/webgpu-torch/tests/">https://praeclarum.org/webgpu-torch/tests/</a> you will see a huge set of tests running to verify all the supported operations. It’s a great way to see what operations are supported and what the results are.</p>

<h2 id="goals">Goals</h2>

<p>I like to train imaging networks and to that end my goal is to get Stable Diffusion and similar nets running under this library. Once that’s accomplished I will focus on the many Hugging Face transformer networks. I’m hoping to get all of them running in the browser at CUDA speeds.</p>

<p>I have a set of TODOs in the README of the project. If you’re interested in helping out, please take a look!</p>]]></content><author><name></name></author><category term="article" /><summary type="html"><![CDATA[TL;DR I’ve been working on a WebGPU optimized inference and autograd library called webgpu-torch with an API that matches PyTorch. The goal is to run neural networks in the browser at speeds comparable to a Linux workstation. Many kernels have been implemented and its design is easily extensible. It’s available on NPM now and works in both the browser and Node.js!]]></summary></entry><entry><title type="html">Generating SVG Images with ChatGPT</title><link href="https://praeclarum.org/2023/04/03/chatsvg.html" rel="alternate" type="text/html" title="Generating SVG Images with ChatGPT" /><published>2023-04-03T00:00:00+00:00</published><updated>2023-04-03T00:00:00+00:00</updated><id>https://praeclarum.org/2023/04/03/chatsvg</id><content type="html" xml:base="https://praeclarum.org/2023/04/03/chatsvg.html"><![CDATA[<p><strong>TL;DR</strong> I used ChatGPT, with varying levels of success, to generate SVG images. With enough patience and prompting, you can get it to draw basic shapes and simple scenes about as well as a five year old can. (Which, hey, let’s face is, is good enough a lot of the time.)</p>

<h2 id="background">Background</h2>

<p>I like the idea of using AI’s to generate images. Go back in this blog and you’ll see lots of posts and casual references to imaging networks I like build and experiment with.</p>

<p>With recent advances like DALL-E, Stable Diffusion, and Midjourney, it seems like the static image generation problem has been solved. Sure, there’s plenty of room for growth and improvement, but the basic problem of generating images from text is solved.</p>

<p>But what about vector graphics?
As an app developer, I’m often in need of graphical resources. Images are neat, but they’re also so 1970s. They don’t scale well and are a terribly wasteful and inconvenient (for editing) encoding of graphical information.</p>

<p>Like everyone else, I’ve been using <a href="https://chat.openai.com/">ChatGPT</a> a lot recently, and I was curious to see if it could generate SVG code for images.
I know it’s pretty good at HTML, so I figured it might be able to do SVG too. It’s certainly not something it was designed to do, but I thought it would be fun experiment to conduct.</p>

<p>To my delight, it actually works. You’ll see below that the results are not always great or, um, err, good, but it’s still pretty cool and useful that it can do this at all.</p>

<p>As of GPT-4, I would say it’s capable of generating decent shapes and simple combinations of shapes. But I’ll let you you judge.</p>

<h2 id="experiments">Experiments</h2>

<p>The following transcripts are from my first experiments generating SVGs with ChatGPT. I was curious to see what kind of images the model could generate, and how it would respond to different inputs.</p>

<p>I performed the experiments using the ChatGPT API using the <code class="language-plaintext highlighter-rouge">gpt-4</code> model.</p>

<p>I used my app <a href="https://github.com/praeclarum/AskGPT">AskGPT</a> to run these prompts and dutifully pasted the responses into SVG files to make this post.</p>

<h4 id="please-generate-an-svg-of-a-catus-sic-out-in-the-desert-on-a-sunny-day-no-explanation">Please generate an SVG of a catus [<em>sic</em>] out in the desert on a sunny day, no explanation.</h4>

<blockquote>
  <p>As an AI language model, I cannot create or directly display SVG images. However, I can help you create the SVG code, which you 
could then use in any SVG viewer or compatible software/website to visualize the cactus in the desert.</p>

  <p>Here’s a simple SVG code for a cactus in a desert environment. Copy and paste this code into an SVG viewer or an HTML document 
to display the image.</p>
</blockquote>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;svg</span> <span class="na">width=</span><span class="s">"300"</span> <span class="na">height=</span><span class="s">"150"</span> <span class="na">xmlns=</span><span class="s">"http://www.w3.org/2000/svg"</span><span class="nt">&gt;</span>
    <span class="c">&lt;!-- Background color (sky) --&gt;</span>
    <span class="nt">&lt;rect</span> <span class="na">width=</span><span class="s">"300"</span> <span class="na">height=</span><span class="s">"150"</span> <span class="na">x=</span><span class="s">"0"</span> <span class="na">y=</span><span class="s">"0"</span> <span class="na">fill=</span><span class="s">"skyblue"</span> <span class="nt">/&gt;</span>
    
    <span class="c">&lt;!-- Sun --&gt;</span>
    <span class="nt">&lt;circle</span> <span class="na">cx=</span><span class="s">"270"</span> <span class="na">cy=</span><span class="s">"30"</span> <span class="na">r=</span><span class="s">"20"</span> <span class="na">fill=</span><span class="s">"yellow"</span> <span class="nt">/&gt;</span>
    
    <span class="c">&lt;!-- Ground --&gt;</span>
    <span class="nt">&lt;rect</span> <span class="na">width=</span><span class="s">"300"</span> <span class="na">height=</span><span class="s">"50"</span> <span class="na">x=</span><span class="s">"0"</span> <span class="na">y=</span><span class="s">"100"</span> <span class="na">fill=</span><span class="s">"tan"</span> <span class="nt">/&gt;</span>
    
    <span class="c">&lt;!-- Cactus --&gt;</span>
    <span class="nt">&lt;rect</span> <span class="na">width=</span><span class="s">"30"</span> <span class="na">height=</span><span class="s">"80"</span> <span class="na">x=</span><span class="s">"130"</span> <span class="na">y=</span><span class="s">"70"</span> <span class="na">fill=</span><span class="s">"green"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;rect</span> <span class="na">width=</span><span class="s">"20"</span> <span class="na">height=</span><span class="s">"30"</span> <span class="na">x=</span><span class="s">"125"</span> <span class="na">y=</span><span class="s">"50"</span> <span class="na">fill=</span><span class="s">"green"</span> <span class="nt">/&gt;</span>
    <span class="nt">&lt;rect</span> <span class="na">width=</span><span class="s">"20"</span> <span class="na">height=</span><span class="s">"30"</span> <span class="na">x=</span><span class="s">"145"</span> <span class="na">y=</span><span class="s">"50"</span> <span class="na">fill=</span><span class="s">"green"</span> <span class="nt">/&gt;</span>
<span class="nt">&lt;/svg&gt;</span>
</code></pre></div></div>

<p><img src="/images/2023/chatsvg_desert1.svg" alt="Cactus in the desert" /></p>

<p><strong>SUCCESS</strong> Although the initial part of the response was discouraging (why the low confidence ChatGPT?), the model dutifully generated the SVG code for a cactus in the desert.</p>

<p>The cactus could obviously use some work, but this image is mostly what I was expecting. I know these models are sensitive to inputs, so I decied to give it some encouragement.</p>

<h4 id="ok-great-but-please-now-generate-a-much-more-detailed-image-using-gradients-and-paths">OK, great. But please now generate a much more detailed image using gradients and paths</h4>

<blockquote>
  <p>Here is a more detailed SVG code for a cactus in the desert using gradients and paths for a richer visual representation. To 
view the image, copy and paste the code into an SVG viewer or an HTML document.</p>
</blockquote>

<p>(I’m going to elide the SVG code from now on, but you can see it by viewing the source of the images.)</p>

<p><img src="/images/2023/chatsvg_desert2.svg" alt="Cactus in the desert 2" /></p>

<p><strong>SUCCESS</strong> OK, so it’s not going to win any rewards with that, but I applaud the effort. The cactus has a bit more shape and the horizon is more interesting.</p>

<p>It looks like the model is having a hard time getting the directions of arcs correct in the SVG code. Golly, I can relate. I’ve written several SVG libraries and still I can never get the ordering right.</p>

<p>That said, I feel like I’m failing the network by not being a good prompter. Let’s try again with stupid amounts of prompting.</p>

<h4 id="please-generate-an-svg-of-a-cactus-in-the-desert-on-a-sunny-day-please-make-it-as-detailed-as-possible-use-vibrant-colors-use-dramatic-lighting-i-want-depth-i-want-to-feel-like-im-there-i-want-something-shockingly-beautiful">Please generate an SVG of a cactus in the desert on a sunny day. Please make it as detailed as possible. Use vibrant colors. Use dramatic lighting. I want depth! I want to feel like I’m there! I want something shockingly beautiful.</h4>

<blockquote>
  <p>I apologize, but as an AI text model, I cannot create images or SVG files. However, I can help you understand how you can create such 
an image using a vector graphics software like Adobe Illustrator or Inkscape.</p>

  <ol>
    <li>Open your preferred vector graphics software and create a new document.</li>
  </ol>

  <p>…</p>
</blockquote>

<p><strong>FAILURE</strong> Oops, I went too far and it just started explaining to me how to use a drawing program. I guess it doesn’t like that kind of prompting.</p>

<p>Let’s change subjects and see how it does at generating icons.</p>

<h3 id="app-ui-icons">App UI Icons</h3>

<h4 id="would-you-generate-the-svg-code-for-a-mac-toolbar-icon-that-toggles-the-selected-item-preferences-pane-on-and-off">Would you generate the SVG code for a mac toolbar icon that toggles the selected item preferences pane on and off?</h4>

<blockquote>
  <p>As an AI, I cannot generate an SVG file directly in this text interface, but I can provide you with the code that you can copy and 
paste into an SVG file, like ‘toggle_preferences_icon.svg’.</p>

  <p>Here’s a simple SVG code example for a mac toolbar icon with two gear-like symbols representing the toggling of the selected item 
preferences pane:</p>
</blockquote>

<p><img src="/images/2023/chatsvg_icon1.svg" alt="Toggle preferences icon" style="width:240px;background-color:white;" /></p>

<p><strong>FAILURE</strong> Haha, OK, um so those aren’t gears. E for effort.</p>

<p>Can we just make a single gear if we concentrate really hard?</p>

<h4 id="please-generate-an-svg-code-of-a-gear-icon-please-make-it-detailed-showing-the-teeth-and-such-you-can-use-paths-if-you-want-make-your-response-as-long-as-it-needs-to-be">Please generate an SVG code of a gear icon. Please make it detailed showing the teeth and such. You can use paths if you want. Make your response as long as it needs to be.</h4>

<blockquote>
  <p>Here’s an SVG code representation of a gear icon using paths:</p>
</blockquote>

<p><img src="/images/2023/chatsvg_icon2.svg" alt="Gear icon" style="width:240px;background-color:white;" /></p>

<p><strong>FAILURE</strong> I see gears aren’t its strong suit.</p>

<p>Let’s see if it can generate something more basic.</p>

<h4 id="please-generate-an-svg-code-of-a-house-icon-you-can-use-paths-if-you-want-make-your-response-as-long-as-it-needs-to-be">Please generate an SVG code of a house icon. You can use paths if you want. Make your response as long as it needs to be.</h4>

<p><img src="/images/2023/chatsvg_icon3.svg" alt="House icon" style="width:240px;background-color:white;" /></p>

<p><strong>SUCCESS?</strong> Do walls need to be straight? Does it matter if your roof has a little overhang? That’s a house. I’ll take it. Even better, I think an app with a “messy icon” aesthetic would be a welcome breath of fresh air.</p>

<p>Enough with boring toolbar icons. Let’s see if it can make a new icon for my app iCircuit.</p>

<h3 id="app-icons">App Icons</h3>

<h4 id="please-generate-the-svg-code-of-an-app-icon-for-icircuit-an-app-that-enables-you-to-draw-and-simulate-electronic-circuits-the-icon-should-have-a-nice-graident-background-and-fit-in-well-on-the-ios-home-screen-you-can-use-paths-if-you-want-make-your-response-as-long-as-it-needs-to-be">Please generate the SVG code of an app icon for iCircuit, an app that enables you to draw and simulate electronic circuits. The icon should have a nice graident background and fit in well on the iOS home screen. You can use paths if you want. Make your response as long as it needs to be.</h4>

<p><img src="/images/2023/chatsvg_appicon1.svg" alt="iCircuit icon" style="width:240px;background-color:white;" /></p>

<p><strong>FAILURE</strong> I dig that gradient (those colors are pretty close to the actual app icon), but the logo is a bit too… abstract. I can forgive abstractness, but I can’t forgive the fact that it’s not centered properly.</p>

<p>Let’s try prompting our way out of this hole.</p>

<h4 id="nice-i-like-that-gradient-but-the-inner-logo-is-too-weird-maybe-put-a-nice-looking-sine-wave-in-there-centered-nicely">Nice, I like that gradient. But the inner logo is too weird. Maybe put a nice looking sine wave in there. Centered nicely.</h4>

<p><img src="/images/2023/chatsvg_appicon2.svg" alt="iCircuit icon 2" style="width:240px;background-color:white;" /></p>

<p><strong>FAILURE</strong> The sin wave is so close! If I were generous, I would grade this one a “SUCCESS?”, but it failed to center again, and I just can’t have that.</p>

<h3 id="art">Art</h3>

<h4 id="please-generate-the-svg-code-of-a-nice-looking-abstract-art-piece-you-can-use-paths-if-you-want-make-your-response-as-long-as-it-needs-to-be">Please generate the SVG code of a nice looking abstract art piece. You can use paths if you want. Make your response as long as it needs to be.</h4>

<p><img src="/images/2023/chatsvg_art1.svg" alt="Abstract art" /></p>

<p><strong>SUCCESS</strong> I like it. It’s very “Saved by the Bell”, very 1990s.</p>

<p>That gives me an idea.</p>

<h4 id="please-generate-the-svg-code-of-a-spaceship-using-awesome-neon-colors-and-rad-gradients">Please generate the SVG code of a spaceship using awesome neon colors and rad gradients.</h4>

<p><img src="/images/2023/chatsvg_art2.svg" alt="Spaceship" /></p>

<p><strong>FAILURE</strong> Cool colors, for sure, but it’s not a spaceship.</p>

<p>OK, it’s not an artist. Let’s see if it can do simple shapes.</p>

<h4 id="please-generate-the-svg-code-of-a-red-circle-inside-of-that-circle-put-a-white-square-inside-that-put-the-red-text-this-is-a-warning-wrap-the-text-if-needed-use-some-gradients-to-make-it-look-good-an-intimidating">Please generate the SVG code of a red circle. Inside of that circle put a white square. Inside that, put the red text ‘THIS IS A WARNING’. Wrap the text if needed. Use some gradients to make it look good an intimidating.</h4>

<p><img src="/images/2023/chatsvg_shapes1.svg" alt="Warning" /></p>

<p><strong>SUCCESS</strong> The text isn’t centered. I get it, it’s hard. I’m expecting a lot out of a language model.</p>

<p>But look, it’s text! Readable text. That’s a big deal because imaging networks aren’t really known for their text prowess. For example, here are DALL-E and Stable Diffusion’s attempts at the same prompt:</p>

<p><img src="/images/2023/chatsvg_shapes1_dalle.jpg" alt="DALL-E" style="width:300px" /> <img src="/images/2023/chatsvg_shapes1_sd.png" alt="Stable Diffusion" style="width:300px" /></p>

<p>At least ChatGPT gave me exactly what I wanted (I’m not thinking about centering, I’m not thinking about centering, …). I got a red circle, a white box, and readable text. Neither of the other two networks could do/did that.</p>

<h3 id="conclusion">Conclusion</h3>

<p>My biggest takeways are:</p>

<ol>
  <li>Yes, ChatGPT can generate SVG code.</li>
  <li>It’s still in a very early stage and cannot handle much complexity.</li>
</ol>

<p>I was hoping for better, but for a network that was not designed to do this at all, it’s not bad. I certainly look forward to GPT-5 and 6 improving at this.</p>

<p>I keep thinking back to the last warning sign experiment.
The images generated by DALL-E and Stable Diffusion were not good. They were not even really what I wanted. ChatGPT, on the other hand, was able to generate something much closer to what I wanted.</p>

<p>I think this is a benefit of how it was trained. The engineers worked hard on the “alignment problem” (pun unintended) and it shows. It does what I expect. Old GPT-3 and friends were pure statistical generators. During their training, the only positive feedback they received was when their output sampling match language use.
ChatGPT, however, was trained with a bias towards being helpful and providing the kinds of answers people want. This was accomplished by having people hand-rank its responses.
I have a feeling that this hand-ranking, in order to aid in the human alignment problem, is going to fuel the next few years of machine learning improvements.</p>

<p>I went into this wanting to see if I could get SVGs, what I think of as a more useful format for my life as an app developer, out of ChatGPT. I got that, but I also got a lot more. I got a glimpse into the future of AI.</p>

<p>Anyway, enough armchair philosophizing. I hope you enjoyed seeing the results of my experiements, and I hope you’ll try some of your own.</p>

<h4 id="notes">Notes</h4>

<ul>
  <li>
    <p>I am aware there are neural nets specifically trained to generate SVG code. <a href="https://github.com/alexandre01/deepsvg">DeepSVG</a> is one example. Specialized networks give more reliable results today and will certainly be useful in the future. I wanted to experiment with ChatGPT because I am intrigued by its generalized knowledge and I wanted to see if that general knowledge produced interesting results.</p>
  </li>
  <li>
    <p>None of this was done scientifically. ChatGPT is still a stochastic (random) model and its outputs can vary. I should have had it generate more samples for each prompt. But I didn’t. I just wanted to see roughly what it could do.</p>
  </li>
</ul>]]></content><author><name></name></author><category term="article" /><summary type="html"><![CDATA[TL;DR I used ChatGPT, with varying levels of success, to generate SVG images. With enough patience and prompting, you can get it to draw basic shapes and simple scenes about as well as a five year old can. (Which, hey, let’s face is, is good enough a lot of the time.)]]></summary></entry><entry><title type="html">Transformers-js - Neural Networks in the Browser</title><link href="https://praeclarum.org/2022/08/25/transformers-js.html" rel="alternate" type="text/html" title="Transformers-js - Neural Networks in the Browser" /><published>2022-08-25T00:00:00+00:00</published><updated>2022-08-25T00:00:00+00:00</updated><id>https://praeclarum.org/2022/08/25/transformers-js</id><content type="html" xml:base="https://praeclarum.org/2022/08/25/transformers-js.html"><![CDATA[<p><strong>TL;DR</strong> I wrote a javascript library that lets you run modern transformer neural networks from Hugging Face 🤗 in the browser. It works on mobile browsers, desktop browsers, pretty much everywhere. Check out <a href="https://github.com/praeclarum/transformers-js">transformers-js on GitHub</a> to see how it works and checkout out the live translation demo running on my static website: <a href="https://transformers-js.praeclarum.org">https://transformers-js.praeclarum.org</a>.</p>

<p><a href="https://transformers-js.praeclarum.org"><img src="/images/2022/transformersjs.png" alt="Screenshot of the transformer-js demo translation web site" /></a></p>

<h2 id="introduction">Introduction</h2>

<p>Currently, the best way to deploy neural networks is to pay a cloud provider to host it and pay them to run inference. The more customers you have, the more you pay. It’s an old-fashioned big-iron middle-man’s utopia.</p>

<p>I’m a big fan of running neural nets on everyday hardware. It makes sense to let customers, who already invested a lot of money and carbon, use their own hardware. It’s also a huge privacy win: attackers can’t steal your information if it’s never on the network (insert Intel joke here). It’s good economically, environmentally, and it’s good for security. Sign me up.</p>

<p>Let’s fight the big-iron trend. Let’s run neural networks in the browser!</p>

<p>Announcing <strikethrough>transformers-js</strikethrough>: a library to make running translation and other language neural nets in the browser simple.</p>

<p><strong>Update (Feb 7, 2023):</strong> I’ve re-released the library as <a href="https://github.com/praeclarum/web-transformers">web-transformers</a> with full NPM and webpack support.</p>

<h2 id="hugging-face--transformers-with-transformers-js">Hugging Face 🤗 Transformers with transformers-js</h2>

<p>Transformers are neural networks that are good at manipulating serialized symbols. Ahem, sorry. By “serialized symbols” I mean language. They do language things: Sentiment analysis, summarization, translation, transmogrification. Basically, any -ation you can think of that works with a discrete set of symbols laid out one after the other.</p>

<p>And you know these networks from their friendly household names: GPT-3, Copilot, DALL-E, Stable Diffusion. There seems no end to what they can do (see also the CNN revolution of 2014).</p>

<p><a href="https://huggingface.co">Hugging Face 🤗</a> has established itself as the “GitHub of Transformers”. They have an excellent unifying framework, great documentation, and good-ish hosting. I only say good-ish hosting here because I had a demo fail because their servers were down. Clouds…</p>

<p>In fact, it was that demo fail that got me to thinking, “why can’t I just run this thing in the browser?” That thought led me to 3 days of programming. Those 3 days produced a javascript library. And that javascript library produces some kick-ass neural translations.</p>

<p>I wrote <a href="https://github.com/praeclarum/transformers-js">transformers-js</a> to make running transformers from Hugging Face 🤗 in the browser just as easy as running them in Python land. To do this, I leverage the amazing ONNX runtime in order to run the network. ONNX offers a browser-compatible runtime using WASM compiled from the complete ONNX opset code. That’s very powerful because it means that, if you can get your net running in ONNX, you can get it running in the browser. (ONNX also offers a webgl backend that is much faster than their WASM backend. But you lose so much precision in webgl that I have yet to see a network work correctly using that engine.)</p>

<p>But running the neural network is only half the battle. Running transformers requires more software than just the neural net. You also need text tokenization software to convert your text to tokens (symbols) and you need sampling software to convert the neural net’s output probabilities back to symbols. Transformers-js takes care of all that for you.</p>

<h3 id="tokenization">Tokenization</h3>

<p>Step 1 in running a transformer is getting a working tokenizer. Each neural net is optimized to solve a problem and that means each net uses a slightly different tokenizer from each other.
I thought writing the tokenizers would be a piece of cake. I’ve written hundreds of tokenizers in my career in my pursuit of programming language nirvana, but I have never run into the kind of tokenizers that data scientists have come up with.</p>

<p><strong>Side tangent:</strong> did you know that modern tokenizers use classical AI approaches? Neither did I! For example, the T5 symbol list is redundant; you can encode the same sentence many many different ways. In order to correctly tokenize the sentence for input to T5, you have to find the optimal path through the redundant symbol list based on the a-priori probabilities of the symbols. It’s a graph problem, and those are hard. Fortunately, classic AI people loved graph problems and found solutions. Two AI winters ago, people thought graphing techniques would be the foundation of all future AIs. They were wrong, but it’s nice to see these old powerful algorithms live on.</p>

<p>Back to tokenizers. I learned all that graph theory so you don’t have to! I encoded that knowledge into code that a computer can decode to make the magic happen. Behold:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Load the tokenizer</span>
<span class="kd">const</span> <span class="nx">tokenizer</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">AutoTokenizer</span><span class="p">.</span><span class="nx">fromPretrained</span><span class="p">(</span><span class="dl">"</span><span class="s2">t5-small</span><span class="dl">"</span><span class="p">,</span> <span class="dl">"</span><span class="s2">/models</span><span class="dl">"</span><span class="p">);</span>
</code></pre></div></div>

<p>That loads a tokenizer. Currently, I only support Sentence Piece Unigram models (good enough for most nets). I hope to support Byte Pair Encoding in the future (GPT’s preferred tokenization).</p>

<p>With that tokenizer, you can convert strings into token lists:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Tokenize "Hello, world!"</span>
<span class="kd">const</span> <span class="nx">english</span> <span class="o">=</span> <span class="dl">"</span><span class="s2">Hello, world!</span><span class="dl">"</span><span class="p">;</span>
<span class="kd">const</span> <span class="nx">inputTokenIds</span> <span class="o">=</span> <span class="nx">tokenizer</span><span class="p">.</span><span class="nx">encode</span><span class="p">(</span><span class="dl">"</span><span class="s2">translate English to French: </span><span class="dl">"</span> <span class="o">+</span> <span class="nx">english</span><span class="p">);</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">inputTokenIds</code> is a list of integers that represent the symbols in the sentence. Some words are just one symbol. While other, less common or longer words, can be more than one symbol.</p>

<p>I added a little prefix to the string (“translate English to French:”) because I’m building up to a translation demo here and the T5 network, with all its advanced capabilities, needs to be told what to do.</p>

<h3 id="generation">Generation</h3>

<p>Now that we have tokens, we can hand them off to the neural network to be run:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Translate</span>
<span class="kd">const</span> <span class="nx">outputTokenIds</span> <span class="o">=</span> <span class="k">await</span> <span class="nx">model</span><span class="p">.</span><span class="nx">generate</span><span class="p">(</span><span class="nx">inputTokenIds</span><span class="p">,</span> <span class="p">{</span><span class="na">maxLength</span><span class="p">:</span><span class="mi">50</span><span class="p">,</span><span class="na">topK</span><span class="p">:</span><span class="mi">10</span><span class="p">});</span>
</code></pre></div></div>

<p>That’s it! The code takes the input tokens, runs them through the network, and returns a new list of output tokens.</p>

<p>That little <code class="language-plaintext highlighter-rouge">generate</code> function is hiding a lot of work. Most networks generate one token at a time. That means you have to
run them over and over until you get the whole sentence. This can be terribly inefficient if you run the <em>entire</em> network
over and over. Instead, you split it into pieces and run each piece only as it is needed.</p>

<p>The <code class="language-plaintext highlighter-rouge">generate</code> method also has to sample from the neural network’s output probabilities. Networks are not into commitment, and will always output a variety of options. A sampling technique is needed to pick the right one.</p>

<p><em>Greedy</em> sampling is when you just pick the highest probability option. <em>Top-k</em> sampling is when you randomly pick from the top <code class="language-plaintext highlighter-rouge">k</code> probable options. Greedy is good for when you want the most probable option. Top-k is good for when you want to inject a bit of creativity (randomness) into the results. This library supports both. I hope to add more sampling options in the future.</p>

<p>Now that we have a list of output tokens, we can convert them back to a string:</p>

<div class="language-js highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Convert output tokens to a string</span>
<span class="kd">const</span> <span class="nx">french</span> <span class="o">=</span> <span class="nx">tokenizer</span><span class="p">.</span><span class="nx">decode</span><span class="p">(</span><span class="nx">outputTokenIds</span><span class="p">,</span> <span class="kc">true</span><span class="p">);</span>
<span class="nx">console</span><span class="p">.</span><span class="nx">log</span><span class="p">(</span><span class="nx">french</span><span class="p">);</span> <span class="c1">// "Bonjour monde!"</span>
</code></pre></div></div>

<p>The output is “Bonjour monde!” which makes sense given our input of “Hello World”.</p>

<p>That’s it! In about 5 lines of code we executed a neural translation algorithm that ran completely in the browser.</p>

<h3 id="demo">Demo</h3>

<p>I put the code above together into a little web app to demonstrate the library.</p>

<p><a href="https://transformers-js.praeclarum.org">https://transformers-js.praeclarum.org</a></p>

<p>It’s a little translation app that is able to go from English to French, German, and Romanian.</p>

<p>When you run it the first time, it will take some time to download the neural network. After that, the browser can cache it and subsequent reloads will be faster.</p>

<p>It’s hosted on <a href="https://azure.microsoft.com/en-us/services/app-service/static/">Azure Static Web Apps</a> to demonstrate that no logic is running on the server. The server merely provides the neural network data to be downloaded and run on the browser. Pretty cool, huh?</p>

<h2 id="optimizing-models-for-the-browser">Optimizing Models for the Browser</h2>

<p>Now, let’s talk about a few last details to make networks in the browser run <em>well</em>.</p>

<p>Because I’m using the WASM version of ONNX, the neural network is executed on the CPU. It’s therefore beneficial to optimize it to run there. The best optimization right now is to use <em>quantized 8-bit</em> weights. This converts what are otherwise 32-bit floats into 8-bit integers. This is done carefully to preserve as much accuracy as possible and is honestly a bit of a dark art.</p>

<p>Fortunately, there are lots of dark art practitioners and we can stand on their shoulders. The <a href="https://github.com/Ki6an/fastT5">fastT5</a> library converts T5 models from Hugging Face 🤗 to quantized 8-bit models ready to run in the browser. I wrapped that library in a script as I hope to support other model types in the future.</p>

<p>Lastly, I highly recommend running this code in a background web worker. This will prevent the browser from locking up while it’s executing. I didn’t implement that feature in the demo web site but I would say it’s worth the effort in a production app.</p>

<h2 id="conclusion">Conclusion</h2>

<p>While there will always be some big networks out there that need big servers to run them, a surprsing number of neural networks work just fine in the browser. Doing so is good for economic, environmental, and security reasons.</p>

<p><a href="https://github.com/praeclarum/transformers-js">Transformers-js</a> was written to make running language models in the browser simple. I hope you will find it enjoyable to use and I hope you will use it to make cool things.</p>]]></content><author><name></name></author><category term="announcement" /><summary type="html"><![CDATA[TL;DR I wrote a javascript library that lets you run modern transformer neural networks from Hugging Face 🤗 in the browser. It works on mobile browsers, desktop browsers, pretty much everywhere. Check out transformers-js on GitHub to see how it works and checkout out the live translation demo running on my static website: https://transformers-js.praeclarum.org.]]></summary></entry><entry><title type="html">Practical Guide to Solving Hard Problems</title><link href="https://praeclarum.org/2022/02/19/hard-problems.html" rel="alternate" type="text/html" title="Practical Guide to Solving Hard Problems" /><published>2022-02-19T00:00:00+00:00</published><updated>2022-02-19T00:00:00+00:00</updated><id>https://praeclarum.org/2022/02/19/hard-problems</id><content type="html" xml:base="https://praeclarum.org/2022/02/19/hard-problems.html"><![CDATA[<p>I sometimes find myself in a position of needing to write some code
that I’m just not sure how to write. Been there have you?
Here are the steps I take when I’m stumped.
No huge revelations here, just hard-earned advice.</p>

<ol>
  <li>
    <p>Think hard about the problem for a few weeks before typing any code.</p>
  </li>
  <li>
    <p>Type in a function or write a class that has the inputs and outputs you need.</p>
  </li>
  <li>
    <p>Break the function down into multiple steps with clear objectives. You may not know how to achieve those objectives, but that’s a problem for your future self. Right now, you’re just trying to write out the high-level algorithm.</p>
  </li>
  <li>
    <p>Create a function for each of those steps and <code class="language-plaintext highlighter-rouge">throw new NotImplementedException()</code> in each of them. Their names should be long and explanatory and there should be no question about what’s expected of them. It’s <em>really</em> OK if you don’t actually know how to write ‘em.</p>
  </li>
  <li>
    <p>Now, go implement a few of those functions. You know they’re not <em>all</em> hard. Some may even be fun! Build up your confidence and implement the easy ones. It feels good to make progress and it lets the analytical part of your brain run in the background for a bit while you focus on nitty-gritty number types and file IO.</p>
  </li>
  <li>
    <p>Time to tackle some of those harder functions. Go into each of those and break the problem down into steps just like you did before. You’re right, I’m gonna say it: Rinse and repeat. Keep breaking those hard problems down into steps. Turn each of those steps into a function with a clear name. Implement the easy ones. Then break the hard ones down into steps again. Do this over and over again. You’ll be surprised how much you can actually get done.</p>
  </li>
  <li>
    <p>Pretty soon (haha) you will have an 80% complete solution with just a few pesky functions left that throw NotImplemented. Now go scour your favorite package repository, or code repository, or question and answer site, or artificial intelligence programming assistant for implementations. Chances are you’re not the first person to need this particular function or widget. Find some giants, climb on top of them, and scream “Holy shit, there are a lot of smart programmers in the world!”</p>
  </li>
  <li>
    <p>OK, you’ve scoured the inter webs and yet you still have a few pesky NotImplemented exceptions. It’s time to check on those scientists. Enter every SEO permutation of your problem statement into arXiv. Surely others have worked on problems related to one you are trying to solve. They will most likely offer insights or perspective shifts that can help you reframe your problem into something solvable. Do that. Reframe your problem and knock out those NotImplementeds.</p>
  </li>
  <li>
    <p>Now you’re in trouble. If you still have a few NotImplemented exceptions, and there are no giants upon which to stand nor academics obsessing over this particular field, then it’s all up to you. Think big. Think outside the box. Your career depends on it. (Just kidding, I hope.) Perhaps a bath will help you think?</p>
  </li>
</ol>

<p>I think these are steps all programmers take, but sometimes it’s good to spell it out.</p>

<p>I especially value the functional decomposition. Functions are a powerful abstraction, not just for writing less code, but for thinking about problems.</p>

<p>And please don’t misinterpret my use of the word “functions” to mean only those things functional programmers like. I mean any data transformer: from lowly lambdas to state-bearing IO-processing monolith objects.</p>

<p>Thanks for reading! Now go solve those hard problems!</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I sometimes find myself in a position of needing to write some code that I’m just not sure how to write. Been there have you? Here are the steps I take when I’m stumped. No huge revelations here, just hard-earned advice.]]></summary></entry></feed>