Write Inefficient Code

Published Mon Feb 12 2024

While writing some code to sort my alphanumeric note ids I ran into a case of premature optimization. To understand the problem I’ll explain how the ids are structured. Each identifier begins with a number, then alternating between letters and numbers. Here are some examples:

  • 1a
  • 1b
  • 2c17a

I use these ids to group notes in a very general sense. The important thing to understand here is that alphabetical sort does not work since a number can be multiple digits. For example, alphabetically 13c comes before 4a1d. Alphabetical sorting compares a single character at a time 1 comes before 4 instead of 13 being sorted after 4.

To compare two of these keys I split the segments of the id into an array for comparison. With performance in mind, I split all of the identifiers into arrays before sorting. This prevents my code from splitting an identifier into parts more than once.

(fn alphanumeric_sort [identifiers]
  (var result (icollect [k v (pairs identifiers)]
      {:keys (id_to_list v) :id v }))
  (table.sort result an_compare)
result)

Here result is instantiated to collect a new list with my split identifiers from id_to_list. To make this optimization I have moved from operating on a list of strings or identifiers to a list of tables holding a list of identifier segments as well as the original id.

What is the alternative here? If I set aside my desire to create less arrays at runtime I can move the id to list conversion into my comparison function. This will create a new array in each comparison, of which there will be many more than there are items in the array.

(fn alphanumeric_sort [items]
  (table.sort items alphanumeric_compare)
result)

This code still runs plenty fast for sorting my files. Simplifying code at the cost of some invisible memory usage is often times worth it. You can afford to iterate over your data a few thousand times as long as your code is more coherent as a result.