ChatGPT in an Embedded World: Quickly Making A Quote Display

Image of a construction worker standing on a roof with his hands on his hips, staring out at the treeline. Quote says, "ChatGPT, Build Something Here."

This week, I built a small embedded program with ChatGPT at the proverbial coder’s keyboard. The only coding I did was to correct small mistakes and to integrate the chunks I asked ChatGPT to generate. I am still experimenting with the various ways I can communicate with ChatGPT, so bear with me as I try some new things.

For our project, we will use an ESP32 and an OLED screen to display famous quotes retrieved from the Internet. The design will be roughly like this:

We won’t concern ourselves with smaller details, like how the user might configure the ESP32 to connect to a wireless network. At this point, we are concerned with standing up the ‘core technology’ of our ‘product.’

ESP32 Quote Fetching (Written by ChatGPT)

First things first, we need to get our ESP32 to retrieve quotes from the Internet. I was originally dubious about ChatGPT’s ability to write an application that could pull these quotes, so I started by asking it to write a program in Python. With a Python script, I could verify functionality before adding steps like compilation and flashing to the mix.
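For the curious, the Python prototype amounted to something like this (a reconstructed sketch rather than ChatGPT’s verbatim output; the function names are mine, and only the quotable endpoint comes from our conversation):

```python
import json
import urllib.request

# Endpoint from my ChatGPT conversation; everything else here is a sketch.
API_URL = "https://api.quotable.io/random"

def parse_quote(payload: str) -> tuple:
    """Pull (content, author) out of the quotable API's JSON payload."""
    doc = json.loads(payload)
    return doc["content"], doc["author"]

def fetch_quote(url: str = API_URL) -> tuple:
    """Fetch one random quote from the API; network errors simply propagate."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return parse_quote(resp.read().decode("utf-8"))
```

Splitting the JSON parsing from the network call made it easy to verify the parsing logic before worrying about connectivity.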

After proving out its abilities in Python, I asked ChatGPT to write the same program in C. You can see our conversation below. Notice the back-and-forth nature of the conversation, including corrections and specifications I made to ChatGPT:

Travis: Can you write the same program in C for a WROOM ESP32?

ChatGPT: Yes, I can! Here's an example of how you could implement the same program in C for an ESP32 module using the ESP-IDF framework:

...CODE BLOCK...

Travis: This is for esp32 so needs a loop and setup

ChatGPT: You're right, I apologize for that oversight. Here's an updated version of the program that includes a setup and loop function for the ESP32:

...CODE BLOCK...

Travis: Can you replace the printf with a serial out

ChatGPT: Sure, here's the modified program with serial output instead of printf:

...CODE BLOCK...

Travis: Sorry I meant the arduino serial out

ChatGPT: No problem! Here's the modified program with serial output using the Arduino Serial library:
#include <Arduino.h>
#include <WiFi.h>
#include <HTTPClient.h>
#include <ArduinoJson.h>
#include <stdlib.h>
#include <time.h>

const char* ssid = "your_SSID";
const char* password = "your_PASSWORD";

void fetch_quote(char *quote, char *author) {
  WiFiClient client;
  HTTPClient http;
  http.begin(client, "https://api.quotable.io/random");
  int httpCode = http.GET();
  if (httpCode != HTTP_CODE_OK) {
    Serial.printf("HTTP request failed: %s\n", http.errorToString(httpCode).c_str());
    http.end();
    return;
  }
  String payload = http.getString();
  DynamicJsonDocument doc(1024);
  deserializeJson(doc, payload);
  String content = doc["content"].as<String>();
  String author_item = doc["author"].as<String>();
  content.toCharArray(quote, 256);
  author_item.toCharArray(author, 256);
  http.end();
}

void setup() {
  Serial.begin(115200);
  while (!Serial);
  Serial.println("Serial connected!");
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.println("Connecting to WiFi...");
  }
  Serial.println("WiFi connected!");
  srand(time(NULL));
}

void loop() {
  char quote[256];
  char author[256];
  fetch_quote(quote, author);
  char output[512];
  sprintf(output, "%s\n- %s\n", quote, author);
  Serial.println(output);
  delay(1000);
}

void setup() {
  Serial.begin(115200);
  while (!Serial);
  Serial.println("Serial connected!");
  WiFi.begin(ssid, password);
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.println("Connecting to WiFi...");
  }
  Serial.println("WiFi connected!");
  srand(time(NULL));
}

void loop() {
  char quote[256];
  char author[256];
  fetch_quote(quote, author);
  char output[512];
  sprintf(output, "%s\n- %s\n", quote, author);
  Serial.println(output);
  delay(1000);
}

After a quick attempt at compiling, I see that I don’t have the ‘ArduinoJson.h’ library that ChatGPT wants to use, so I ask how to obtain it in the following snippet of the conversation:

Travis: How do I use the ArduinoJson.h?

ChatGPT: ArduinoJson.h is a library that provides a simple and efficient way to parse and generate JSON data on Arduino and other embedded systems. To use it, you'll need to follow these steps:

    Download and install the library. You can do this by going to the "Sketch" menu in the Arduino IDE, selecting "Include Library", and then "Manage Libraries". In the Library Manager, search for "ArduinoJson" and click "Install" to install the latest version.

...MORE INSTRUCTIONS (NOT NEEDED FOR OUR CASE)...

Once I followed ChatGPT’s instructions, I found one final bug: there were two ‘setup’ and ‘loop’ blocks. After removing the duplicates, I had a program that compiled. The next exciting step was to flash it to the ESP32 and see whether the serial monitor would start displaying quotes!

Attempt #1 Output

Serial connected!
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
WiFi connected!
HTTP request failed: connection lost
xV���k@T@0
- ��@�?

HTTP request failed: connection lost
xV���k@T@0
- ��@�?

...MORE OF THIS...

The ESP32 connected to WiFi. It looks like it attempted an HTTP request, but the connection was lost. Let’s ask ChatGPT what is up…

Travis: I am getting 'HTTP request failed: connection lost'

ChatGPT: The error "HTTP request failed: connection lost" could indicate that the ESP32 is losing its Wi-Fi connection during the HTTP request. This could happen if the Wi-Fi signal is weak or unstable.

To troubleshoot this issue, you could try the following steps:

   1. Make sure that the ESP32 is connected to a stable Wi-Fi network with a strong signal.

   2. Increase the timeout for the HTTP request by calling the setTimeout method on the HTTPClient object. For example:

...CODE BLOCK...

This will give the ESP32 more time to complete the HTTP request before timing out.

    3. Try using a different HTTP client library, such as the WiFiClient library, to make the HTTP request. For example:

...CODE BLOCK...

This code uses the WiFiClient library to make an HTTP GET request to the API server. Note that this code does not use HTTPS, so it is less secure than the previous code that used WiFiClientSecure. However, it may be more reliable in some situations.

I hope this helps! Let me know if you have any other questions.

ChatGPT gave me a list of debug options. Super helpful!

I took ChatGPT’s list out of order and homed in on a small detail in the ‘WiFiClientSecure’ debug option. I noticed in the original code that ChatGPT was accessing the quotable API over an HTTPS connection. I simply removed the ‘s’ from ‘https’ to access it without SSL security and… (drum-roll, continue on to the next section please!)…

Attempt #2 Output, Nicely Done ChatGPT!

Serial connected!
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
WiFi connected!
Never find fault with the absent.
- Alexander Pope

One machine can do the work of fifty ordinary men. No machine can do the work of one extraordinary man.
- Elbert Hubbard

Real success is finding your lifework in the work that you love.
- David McCullough

A dream is your creative vision for your life in the future. You must break out of your current comfort zone and become comfortable with the unfamiliar and the unknown.

We have quotes!

ESP32 OLED Display Output

All right, now we need to set up the quotes we get from the Internet to display on our OLED screen.

Check out my continued conversation with ChatGPT as I ask for help on the display portion of the project.

Travis: I have a 0.96" OLED screen, are you familiar with them?

ChatGPT: Yes, I'm familiar with 0.96" OLED screens! They are commonly used with microcontrollers and are great for displaying text and graphics in small projects. There are a variety of libraries available for interfacing with OLED screens, so depending on the specific screen you have, there may be different libraries that work best.

If you have any questions about using an OLED screen with your project, feel free to ask!

Travis: I would like to display the fetched quote on the OLED screen instead of serial out

ChatGPT: Sure, here's an example code snippet that shows how you can display the fetched quote on a 0.96" OLED screen using the Adafruit_SSD1306 library:

...CODE BLOCK (See Below)...

In this example code, we're using the Adafruit_SSD1306 library to initialize and interact with the OLED screen. The fetched quote is then displayed on the OLED screen using the display object's println() and setCursor() methods.

Note that you'll need to modify the API_HOST, API_URL, your_SSID, and your_PASSWORD variables to match your specific requirements.

Travis: What pins are used for the OLED screen on the ESP32?

ChatGPT: The pins used for the OLED screen on the ESP32 depend on the specific breakout board you're using. However, many 0.96" OLED screens for the ESP32 use the I2C protocol to communicate with the microcontroller.

Here's an example pinout for the popular ESP32 DevKitC board and an OLED screen using the I2C protocol:

...TABLE (See Below)...

Note that the pin names on your OLED screen may be different depending on the manufacturer, so it's important to consult the documentation for your specific screen to ensure you're using the correct pins. Additionally, you may need to install additional libraries to interface with the OLED screen depending on the specific breakout board you're using.

To summarize our conversation, I asked if ChatGPT was familiar with the OLED screen I was using. Then, I asked if ChatGPT could generate the code to display the quote fetched on the OLED screen. Finally, I asked how I should wire my ESP32 to the OLED screen.

Below you will find the code and pin-out table that ChatGPT provided me. I noticed a couple of things, namely:

  • ChatGPT’s code generation is nowhere near deterministic. Each regeneration was noticeably different from the last.
  • Anecdotally, ChatGPT seems to have organized the code better with the added complexity of displaying to the OLED screen.
#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>
#include <HTTPClient.h>
#include <ArduinoJson.h>

#define SCREEN_WIDTH 128 // OLED display width, in pixels
#define SCREEN_HEIGHT 64 // OLED display height, in pixels
#define OLED_RESET -1    // Reset pin # (or -1 if sharing Arduino reset pin)
#define SCREEN_ADDRESS 0x3C // I2C address of the OLED screen
#define API_HOST "api.quotable.io"
#define API_URL "/random"

Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET);

void setup() {
  Serial.begin(115200);

  // Initialize OLED display
  if(!display.begin(SSD1306_SWITCHCAPVCC, SCREEN_ADDRESS)) {
    Serial.println("SSD1306 allocation failed");
    while (1);
  }

  // Clear the display buffer
  display.clearDisplay();

  // Set text color to white
  display.setTextColor(SSD1306_WHITE);

  // Set font size
  display.setTextSize(1);

  // Connect to Wi-Fi network
  WiFi.begin("your_SSID", "your_PASSWORD");
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.println("Connecting to WiFi...");
  }
  Serial.println("Connected to WiFi");
}

void loop() {
  // Make HTTP request to API
  HTTPClient http;
  http.begin(API_HOST, 80, API_URL);
  int httpCode = http.GET();
  if (httpCode == HTTP_CODE_OK) {
    String payload = http.getString();

    // Parse JSON response
    const size_t bufferSize = JSON_OBJECT_SIZE(2) + JSON_OBJECT_SIZE(3) + 160;
    DynamicJsonDocument doc(bufferSize);
    deserializeJson(doc, payload);
    const char* quote = doc["content"];
    const char* author = doc["author"];

    // Display quote on OLED screen
    display.clearDisplay();
    display.setCursor(0, 0);
    display.println(quote);
    display.setCursor(0, 20);
    display.println("- " + String(author));
    display.display();
  } else {
    Serial.println("HTTP request failed");
  }

  // Wait 10 seconds before making the next request
  delay(10000);
}
ESP32 Pin    OLED Pin
GPIO21       SDA
GPIO22       SCL
3.3V         VCC
GND          GND

Wiring table provided by ChatGPT

Display, Attempt #1 Output

I am blown away by the results of this. It worked! There was only one flaw: When a quote was too long, it wrapped around the screen over the top of the author’s name. This display problem was due to ChatGPT’s use of hard-coded location coordinates. I could see two solutions right out of the gate:

  1. Have the quote scroll across the screen instead of wrapping.
  2. Have the author’s name appended to the quote with a new line just before it.

Either of these would have worked, but both required too much coding for me. So, I asked ChatGPT to do it!

Travis: The text wraps and is colliding with the authors name, how can I resolve this?

ChatGPT: To prevent text from colliding with other elements on your OLED screen, you may need to adjust the font size, spacing, or position of your text.

Here are some suggestions for resolving the text wrapping issue:

List Items 1..2

3. Move the author's name to a different location on the OLED screen, such as below the quote text or in a separate area of the screen. This will create more space for the quote text to be displayed without colliding with other elements.

List Items 4..5

Travis: How would I implement option #3

ChatGPT: To move the author's name to a different location on the OLED screen, you can adjust the Y position of the text using the display.drawString() function. Here's an example of how to move the author's name to be displayed below the quote text:

...CODE BLOCK...

In this example, we first display the quote text using the drawStringMaxWidth() function, which automatically wraps the text based on the width of the OLED screen. We then use the setCursor() function to move the cursor to a new line below the quote text, and display the author's name using the print() function.

Note that you may need to adjust the Y position of the cursor depending on the size of your OLED screen and the amount of spacing you want between the quote text and the author's name.

Travis: How would I dynamically change the authors line to be below the quote text

ChatGPT: If you want to dynamically change the position of the author's name based on the length of the quote text, you can calculate the height of the quote text and adjust the Y position of the author's name accordingly.

Here's an example of how to do this:

...MORE DEBUG (2 More Rounds of Questions)...

Travis: Is there a way to simply get the cursors location after a println?
 
ChatGPT: In Arduino, you can use the getTextBounds() function to get the size of the text that was just printed. You can then use this information to calculate the position of the next line of text.

Here's an example:

...CODE BLOCK...

In this example, we start by printing the first line of text. We then use the getTextBounds() function to get the size of the printed text. The function takes six parameters: the string of text to measure, the x and y coordinates of the cursor, and four output variables that will contain the position and size of the text.

After getting the size of the printed text, we calculate the position of the next line of text by adding some padding between lines. Finally, we print the second line of text at the new cursor position.

ChatGPT’s Solution to the Text Collision

Basically, I asked ChatGPT how to resolve the text collision with the author’s name, and it gave me a list of suggestions. I opted to move the author’s name below the quote on the screen, then asked ChatGPT to generate the code for me. The new code broke my compile by calling functions that don’t exist in the libraries I have installed. After a few more questions, I had an example of a way to do it without those libraries. Below is the example that got me “un-stuck”:

int x, y; // x and y coordinates of the cursor

// Print the first line of text
display.println("Hello, world!");

// Get the size of the printed text
int16_t x1, y1;
uint16_t w, h;
display.getTextBounds("Hello, world!", x, y, &x1, &y1, &w, &h);

// Calculate the position of the next line of text
x = x + w + 5; // add some padding between lines
y = y + h;

// Print the second line of text
display.println("This is the second line.");

I couldn’t merely copy and paste the above snippet of ChatGPT’s code into the overall program. Instead, I used it as a general example to guide my own implementation. For someone acting as the architect rather than the developer, this was almost too much coding work, but it was still helpful for finishing out the quote display sketch. The final code is below:

#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>
#include <WiFi.h>
#include <HTTPClient.h>
#include <ArduinoJson.h>

#define SCREEN_WIDTH 128 // OLED display width, in pixels
#define SCREEN_HEIGHT 64 // OLED display height, in pixels
#define OLED_RESET -1    // Reset pin # (or -1 if sharing Arduino reset pin)
#define SCREEN_ADDRESS 0x3C // I2C address of the OLED screen
#define API_HOST "api.quotable.io"
#define API_URL "/random"

Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET);

void setup() {
  Serial.begin(115200);

  // Initialize OLED display
  if(!display.begin(SSD1306_SWITCHCAPVCC, SCREEN_ADDRESS)) {
    Serial.println("SSD1306 allocation failed");
    while (1);
  }

  // Clear the display buffer
  display.clearDisplay();

  // Set text color to white
  display.setTextColor(SSD1306_WHITE);

  // Set font size
  display.setTextSize(1);

  // Connect to Wi-Fi network
  WiFi.begin("your_SSID", "your_PASSWORD");
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.println("Connecting to WiFi...");
  }
  Serial.println("Connected to WiFi");
}

void loop() {
  // Make HTTP request to API
  HTTPClient http;
  http.begin(API_HOST, 80, API_URL);
  int httpCode = http.GET();
  if (httpCode == HTTP_CODE_OK) {
    String payload = http.getString();

    // Parse JSON response
    const size_t bufferSize = JSON_OBJECT_SIZE(2) + JSON_OBJECT_SIZE(3) + 160;
    DynamicJsonDocument doc(bufferSize);
    deserializeJson(doc, payload);
    const char* quote = doc["content"];
    const char* author = doc["author"];

    display.clearDisplay();
    display.setCursor(0, 0);
    display.println(quote);

    // Get the size of the printed text
    int x = 0, y = 0; // cursor coordinates where the quote was printed
    int16_t x1, y1;
    uint16_t w, h;
    display.getTextBounds(quote, x, y, &x1, &y1, &w, &h);

    // Calculate the position of the next line of text
    x = x + w + 5; // add some padding between lines
    y = y + h;


    display.setCursor(x, y);
    display.println("- " + String(author));
    display.display();
    

  } else {
    Serial.println("HTTP request failed");
  }

  // Wait 10 seconds before making the next request
  delay(10000);
}
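If you just want the layout idea without the display plumbing, here it is sketched in Python. This is an illustration, not the sketch’s code: textwrap stands in for the pixel-based getTextBounds() metrics, and the 21-character line width is an assumption based on the default size-1 font (roughly 6 pixels per character on a 128-pixel-wide screen):

```python
import textwrap

def layout_quote(quote: str, author: str, chars_per_line: int = 21) -> list:
    """Wrap the quote to the screen's character width, then place the
    author on a separate line below it -- avoiding the collision bug."""
    lines = textwrap.wrap(quote, width=chars_per_line)
    lines.append("- " + author)
    return lines
```

Each returned string maps to one row on the display, so the author always lands on the line after the wrapped quote, however long the quote is.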

Conclusion

I was able to get a functioning copy of the core technology of this project in under 2 hours. Granted, I needed some background knowledge in coding to integrate the pieces and pull in the required libraries. I still got stuck a few times, but ChatGPT was able to guide me through each one.

Image shows a fully assembled version of the system with functioning code generated by ChatGPT.

So far, I am concerned about a few things after using ChatGPT:

  • Does it actually make things faster?
  • Blind acceptance is problematic.
  • I’m not so keen to be reliant on something which exists exclusively online.

Let’s quickly address these three points.

Does it actually make things faster?

I have to question how much faster this actually makes an engineer. In the case of this morning, I think I completed the goal faster than I might have otherwise. However, there is a case to be made that the large amount of output from ChatGPT can easily make an engineer feel like they are making headway, when in fact they are not.

Blind acceptance is problematic (ChatGPT might not have your back)

In today’s article, ChatGPT offered a few different APIs for fetching quotes from the Internet. This would be a problem if ChatGPT’s training data happened to include malicious APIs and someone blindly accepted the result as “working.” The same concern applies to libraries suggested by ChatGPT.

Reliance on something that exists only online

This concern is probably the least troublesome of all three, as most developers will presumably have an Internet connection. However, if you are relying on ChatGPT for the majority of your work, you may be lost if your Internet connection is compromised. In the future, it would be awesome to have an onsite version of the Large Language Model to ensure engineers have guaranteed access. A model developer firm could license local versions (based on a general set of training data) to individual corporations. The client companies could then continue to train their individually-licensed models on their own internal data. Intellectual property would be safeguarded and models would be more versatile and accurate when responding to user needs.

Wrapping Up

Thank you for reading this week’s post! It was really quite amazing working with ChatGPT. I learned some new and crazy things, like the Python “Easter egg” built-in module called ‘this’, which prints a poem written by Tim Peters (?!).

As always, please feel free to leave a comment below!

-Travis

My Other Work

Asyncio in Python: The Super Power No One Uses

The image shows three W's superimposed on each other. The text in the image says on the top, "Let's Share What We Know." On the bottom of the image, it says "World Wide Web," with 2 blue and yellow Python cross icons between the words "World Wide" and also between the words "Wide Web." The graphic is here to remind us of the large scale of the Internet. It is composed of over 4 billion IPV4 addresses! In this article, I use the asyncio library to quickly scan all of these addresses with a single process.

This week, we explore the asyncio library and calculate how long it would take for a single process to scan the entire Internet’s worth of IPV4 addresses (…only 4,294,967,296 total addresses!).

In this article, I briefly walk through the asyncio library and create a real-world example!

If you are new to Python or programming in general, I would advise you to learn the basics before approaching asyncio. This is a bit of an advanced topic, so having a solid grasp on the foundations of synchronous programming will be a requirement here.

A [Kind of] Experienced Software Guy

Asyncio

Asyncio was introduced as a built-in library in Python version 3.4. It is intended for IO-bound operations. One example is network communication, as it is often the case that you will wait for data to be returned from a remote server.

In the time spent waiting, however, you may have an opportunity to complete other work. This is where an asynchronous application shines.

You define a co-routine by simply adding the ‘async’ keyword in front of the normal ‘def’ keyword, as shown below:

async def main():

When you want to run this function, it must be done via a runner like this:

asyncio.run(main())

The Mental Model of Asyncio

When a runner is created to execute one or many asynchronous tasks, it starts by creating an event loop.

“The event loop is the core of every asyncio application.”

Python Docs

Below is a very simplistic mental model for a basic asyncio application.

Diagram of a very simplistic mental model for a basic asyncio application.

Each time an asyncio ‘Task’ has to ‘await’ something, there is an opportunity for the other tasks in the loop to execute more of their work if they are no longer ‘awaiting.’

Our mental model looks very similar to a single-threaded synchronous application. However, each of the asynchronous tasks is handled independently, as long as no single task hogs the event loop’s attention.
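You can watch this hand-off happen with two toy tasks (the delays are arbitrary):

```python
import asyncio

async def worker(name: str, delay: float) -> str:
    # While this task awaits its sleep, the event loop runs the other task.
    await asyncio.sleep(delay)
    return name

async def main() -> list:
    # Both tasks run concurrently, so the total runtime is roughly
    # max(0.05, 0.10), not the 0.15-second sum.
    return await asyncio.gather(worker("fast", 0.05), worker("slow", 0.10))

results = asyncio.run(main())
```

Note that gather() returns results in the order the awaitables were passed in, regardless of which task finished first.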

Concrete Example in Asyncio

Now we have all that is needed to develop a solution for an important question:

How Long Would it Take?

How long would it take for a single-threaded program to scan the entire Internet’s worth of IPv4 addresses?

Let’s start by defining a class that can generate all of the IP addresses.

from dataclasses import dataclass
from typing import List

@dataclass
class IPAddress:
    subnet_ranges: List[int]

Above, we define a data class that simply holds a list of subnet values. If you are familiar with IPV4, you probably guessed that this list will have a length of four. I didn’t go out of my way to enforce the length, but in a production environment it would be ideal to do so.

We now have something that can hold our IP address in a sane way, but we still need to generate them.

@classmethod
def get_all_ips(cls, start: int = 1, stop: int = 255) -> 'IPAddress':
    first = second = third = fourth = start

    while fourth < stop:
        first += 1

        if first >= stop:
            first = start
            second += 1
        if second >= stop:
            second = start
            third += 1
        if third >= stop:
            third = start
            fourth += 1
        curr_ip = cls([fourth, third, second, first])

        if cls.is_valid_ip(curr_ip):
            yield curr_ip

For this, I introduce a factory class method that will yield IP addresses with the default range of ‘1.1.1.1’ to ‘255.255.255.255.’ The method increments the least-significant subnet value and rolls over to the higher-order subnets each time its value reaches 255. The bulleted list below illustrates the method’s address outputs.

  • 1.1.1.1
  • 1.1.1.2
  • 1.1.1.3
  • 1.1.1.254
  • 1.1.1.255
  • 1.1.2.1
  • 255.255.255.254
  • 255.255.255.255
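As a sanity check, the same enumeration can be written much more compactly with itertools.product (a sketch, not my hand-rolled version above; unlike that version, it enumerates every combination in the range without skipping boundary values):

```python
from itertools import product

def all_ips(start: int = 1, stop: int = 256):
    """Yield every dotted-quad address with each octet in [start, stop)."""
    for a, b, c, d in product(range(start, stop), repeat=4):
        yield "{}.{}.{}.{}".format(a, b, c, d)
```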

If you have a keen eye, you will have likely noticed the ‘is_valid_ip’ class method. It’s called just before yielding to the calling function.

This function simply checks that the IP address does not fall in one of the reserved ranges (0.x.x.x, plus the 10.x.x.x, 172.16–31.x.x, and 192.168.x.x private blocks). See below:

@classmethod
def is_valid_ip(cls, ip_Address: 'IPAddress') -> bool:
    if ip_Address.subnet_ranges[0] == 0:
        return False
    
    if ip_Address.subnet_ranges[0] == 10:
        return False
    
    if ip_Address.subnet_ranges[0] == 172 and 16 <= ip_Address.subnet_ranges[1] <= 31:
        return False
    
    if ip_Address.subnet_ranges[0] == 192 and ip_Address.subnet_ranges[1] == 168:
        return False
    
    return True
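The same filter can be spot-checked as a standalone function (a self-contained restatement of the classmethod above, so the checks can run without the rest of the class):

```python
def is_public(octets) -> bool:
    """True unless the address falls in a range the scanner should skip:
    0.x.x.x, 10.x.x.x, 172.16-31.x.x, or 192.168.x.x."""
    a, b = octets[0], octets[1]
    if a == 0 or a == 10:
        return False
    if a == 172 and 16 <= b <= 31:
        return False
    if a == 192 and b == 168:
        return False
    return True
```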

Are We Asynchronous Yet?

No, not yet…but soon! Now that we have our IP address generator defined, we can start building an asynchronous function that will do the following:

  • Iterate our generator function N times to get a batch of IPs.
  • Create an asynchronous task for each IP address in the batch, each of which checks whether a port is open.

By adding timers to this code, we will find out how long it would theoretically take! Keep in mind that we already know the performance impacts of using Python vs. C++, but this is a toy problem, so…Python is perfect.

Iterate our Generation Function

for _ in range(IP_COUNT_PER_GATHER):
    try:
        next_group_of_ips.append(next(ip_addr_iter))
    except StopIteration:
        complete = True
        break

Above is how we will iterate our IP generator function.

Create an Asyncio Task For Each IP

for i in range(len(next_group_of_ips)):
    async_tasks.append(asyncio.create_task(check_port(str(next_group_of_ips[i]))))

We create a task for each IP port check and store a reference to the task.

Asyncio: Await Results

With multiple tasks executing at the same time, it doesn’t gain us much if we have to ‘await’ each one individually. To solve this problem, the asyncio library has a function called ‘gather.’ See below for how I used ‘gather’ in this application:

await asyncio.gather(*async_tasks)

for i in range(len(next_group_of_ips)):
    if async_tasks[i].result():
        ip_addresses_found.put(str(next_group_of_ips[i]))

By ‘awaiting’ the ‘gather’ function, we are actually awaiting all tasks in the list. Once they have all completed, we check each task’s result and add every IP that produced a hit to our queue for later processing.

All Together!

The whole function together looks like this:

async def main(ip_addresses_found: Queue, start: int = 1, end: int = 255):
    ip_addr_iter = iter(IPAddress.get_all_ips(start, end))
    complete = False

    while not complete:
        next_group_of_ips = []
        async_tasks = []

        for _ in range(IP_COUNT_PER_GATHER):
            try:
                next_group_of_ips.append(next(ip_addr_iter))
            except StopIteration:
                complete = True
                break

        for i in range(len(next_group_of_ips)):
            async_tasks.append(asyncio.create_task(check_port(str(next_group_of_ips[i]))))

        await asyncio.gather(*async_tasks)

        for i in range(len(next_group_of_ips)):
            if async_tasks[i].result():
                ip_addresses_found.put(str(next_group_of_ips[i]))

Conclusion

The time has come to share my results! These will vary based on Internet latency and machine hardware.

I set my scan to do 10k IPs per batch, with a timeout on the connection of 1 second. This resulted in an average batch runtime of ~1.3 seconds.

I didn’t let it run to see how long it would actually take (running this program had major effects on our ability to use the Internet), but if you divide the total number of possible IPs by our batch size of 10k, you get ~430k batches. At 1.3 seconds per batch, that totals 558,346 seconds, or 6.46 days of constant running.
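The back-of-the-envelope arithmetic, spelled out:

```python
TOTAL_IPS = 2 ** 32          # 4,294,967,296 possible IPv4 addresses
BATCH_SIZE = 10_000          # IPs checked per gather
SECONDS_PER_BATCH = 1.3      # measured average batch runtime

batches = TOTAL_IPS / BATCH_SIZE              # ~430k batches
total_seconds = batches * SECONDS_PER_BATCH   # ~558,346 seconds
days = total_seconds / 86_400                 # ~6.46 days
```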

Not as bad as I originally thought 🙂

Fun fact: I was first introduced to coroutines while programming my Paper Boy game in Unity!

Thanks for reading this week! Please ‘like’ if you enjoyed the content. Feel free to leave any comments or suggestions as well!

-Travis

C++ Wrapped in Python: A Daring but Beautiful Duo

Image depicting a coal miner picking away at a ceiling just below a children's playground. The coal miner represents C++ and the children's playground represents the land of Python.

Have you ever wondered how Python libraries are so fast? Well, it’s because they aren’t written in Python! C and C++ form the backbone that makes Python so powerful. This week, we will look at how I adopted this idea for the spelling maze generator we built previously!

The post before last really exemplified why being able to program in C++ gives you unending power in the realm of efficiency gains. Unfortunately, however, most of the world has moved on to the cleaner and more abstract sandboxes of Python, mainly to reduce the cost of developing new software. I guess some people just aren’t here for the pain like others…

Anyways, this week’s post tackles how the hardcore C++ programmer can still make friends in the Python playground.

It’s Already Built

Cartoon depiction of a man standing next to a wheel with his arm resting at the peak. There is text in the upper right corner asking "Is This New?" The intention being to ask if we are introducing anything new to C++.

Lucky for us, people share their tunnels to brighter levels of existence — all we have to do is look and follow!

The simplest way to make your C++ program usable in the high-level language of Python is to use the Pybind11 Library.

pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa, mainly to create Python bindings of existing C++ code.

Pybind11 Frontpage

What does it mean?

We just have to include a header file to be able to bind functions to a module that is importable/callable from a Python script.

Where would this be useful?

I am glad you asked! We already have a project that is a perfect candidate. Do you remember when we produced spelling mazes to help my son study for spelling? Then, we rewrote it in C++ to measure the efficiency gains…which were sizeable.

Well, why not pair the two? Then we can keep our web front-end built in Python and back it up with our C++ library!

Code Time (C++, CMake)

So really, our code is going to be very minimal. All we need to do is:

  • Create a single function that accepts a word and saves our maze to a file
  • Bind that function to a Python module using Pybind11’s PYBIND11_MODULE macro

The Function (C++)

// 'generator' is a global std::mt19937 defined elsewhere in the project
void generate_maze(std::string word, int grid_width, int grid_height, std::string file_prefix, int block_width = 20, int block_height = 20){
    // Seed the random number generator so each maze comes out different
    generator.seed(time(0));
    WordMaze m(word, grid_width, grid_height, block_width, block_height);
    m.save_to_png(file_prefix + word + ".png");
}

Above, you can see the function we use to create our WordMaze object in C++ and save it off to a file. You might notice we are taking in a few more parameters than we had previously discussed. This is just to make it easier to configure the maze from the Python side.

The Module Binding (C++)

PYBIND11_MODULE(SpellingMaze, m) {
    m.def("generate_maze", &generate_maze, "A function to generate a maze.");
}

Could it really be that simple? Well, yes and no. This is the code that is required, but actually building it is the part that will sneak up on you (especially if you are on a Windows machine).

Building

To preface this section, I will say:

If you are building a first-party app without third party dependencies in C++, building with Pybind11 is going to be really easy (their website does an excellent job of walking you through the many ways to build).

Travis (He’s a NOOB at C++ build systems)

However, we don’t have it that easy, since we need to also link against SFML.

Story Time

So, with my little experience setting up C++ build systems (none), I set out to combine the powers of SFML and Pybind11 to create a super fast high-level Maze Generator!

SFML, Pybind11 and MSYS2 walk into a bar…

Previously, I wrote an article on how to set up a Windows build environment for compiling simple SFML applications. Looking back now, I realize there are definitely better ways to do this.

For this project, I struggled to wrap my head around including Pybind11 in my already convoluted SFML build system. From the examples given on the Pybind11 website, it was clear my CMake approach wasn’t their go-to option for building.

In fact, the CMake example they give requires that the entire Pybind11 repository be included as a submodule to my repository. So, after a few failed attempts at copy-pasting their CMake config, I took a different approach.

Side note: I like using Visual Studio Code. It is by far my favorite IDE with all of its plugins and remote development solutions. However, in this case, the CMake plugin makes things nearly impossible to understand which configuration is being generated.

Eventually, I opted for using MSYS2 directly. I was able to install the Pybind11 and SFML dependencies using the following commands:

pacman -S mingw-w64-x86_64-pybind11
pacman -S mingw-w64-x86_64-sfml

This made finding them as easy as including these lines in my CMakeLists.txt:

find_package(SFML 2.5 COMPONENTS system window graphics network audio REQUIRED)
find_package(pybind11 REQUIRED)

Here is our final CMakeLists.txt file:

cmake_minimum_required(VERSION 3.12 FATAL_ERROR)

project(SpellingMaze VERSION 0.1)

find_package(SFML 2.5 COMPONENTS system window graphics network audio REQUIRED)
find_package(pybind11 REQUIRED)

pybind11_add_module(SpellingMaze src/spelling_maze.cpp)

target_link_libraries(SpellingMaze PRIVATE sfml-graphics)

Conclusion

Just like that, we can bring the power of C++ to the world of Python (even on a Windows machine!). I hope you enjoyed this post! I am always open to feedback. Let me know if there is anything you would like me to dive into, attempt, and/or explain! I am more than willing to do so. Please add your questions and suggestions in the comments below!

Thanks!

-Travis

Traceroute in Python: A Powerful Wrapper Guide Exposed

Image shows penguin wrapped by a Python. This is supposed to convey the main idea of this article, which is to create a Linux program wrapper with the traceroute program to make tracing a network route easier for a Python user.
Part of this image was generated by DALL-E 2

This week, we will create a Linux program wrapper for the ‘traceroute’ program to make tracing a network route easier for a Python user. By the end of this article, I will have demonstrated a few useful techniques in data parsing and object orientation that you might find helpful.

The Target: ‘traceroute,’ Linux, and Python

Here is an abstract view of the entire system:

Image shows entire system from a high-level view. 'Python Business Logic' is at the top. 'Python Linux Layer' is in the middle. The 'Linux' layer is at the bottom. Although not shown in the diagram, the traceroute program resides in the Linux layer.

In the diagram, the ‘Python Linux Layer’ is the thing we want to program. ‘Traceroute’ resides at the lowest ‘Linux’ layer.

We want to wrap the traceroute tool on a Linux system and make it usable from a Python program. ‘Traceroute’ lists the hops a network packet passes through on its way to a remote host.

A Plan

We need to be able to:

  • Call the program from Python and capture its data
  • Parse the data into an importable and usable object

The first will be relatively easy, but the second is where the work lies. If we do it in a systematic way and keep the goal in mind, however, it shouldn’t be too difficult.

Calling the ‘traceroute’ Program and Storing its Output

First things first, we need to call the program…

import subprocess

subprocess.call(["traceroute", "-4", "periaptapis.net"])

Wow! That was easy!

To explain this a little bit, we are using ‘traceroute’ in IPv4 mode (the ‘-4’ flag).

Right now, it is printing directly to the console, which is not ideal — we need that data for parsing!

To accomplish this, we will switch the method we call from ‘call’ to ‘run.’ This method has a ‘capture_output’ parameter. According to the documentation, when ‘capture_output’ is set to True, the ‘run’ method will return an object which contains the text of the program’s output in stdout.

import subprocess

ret = subprocess.run(["traceroute", "-4", "periaptapis.net"], capture_output=True)

We can access that captured output like this:

print(ret.stdout.decode())

We use the ‘decode’ method because stdout is in bytes and we need a string (more on that here).

Wow! Only 3 lines of code and we already have the data in a place that is very accessible!
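The capture-and-decode pattern is easy to try in isolation. Here is a minimal sketch that swaps ‘traceroute’ for the Python interpreter itself, so it runs on any machine; the command list is the only thing you would change:

```python
import subprocess
import sys

# Any command works here; we use the Python interpreter itself so the
# sketch runs everywhere, even where 'traceroute' is not installed
ret = subprocess.run(
    [sys.executable, "-c", "print('captured!')"],
    capture_output=True,
)

# stdout arrives as bytes, so decode it into a string for parsing
text = ret.stdout.decode().strip()
print(text)  # captured!
```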

Data Parsing

One of the most useful skills I think I’ve gained from programming in Python is data parsing.

Specifically, as a firmware engineer, I produce a lot of serial log data for the boards that I work on. Without the data parsing techniques I have accumulated, I would be spending the majority of my time scrolling long text files looking for less than 1% of the information they hold. Not just once either, because I (like every other programmer) never get it right on the first try.

Data parsing is also useful elsewhere: website scraping, big data collection, and more. Once you get fast enough, it will be hard not to write a parser for everything.

Taking a Peek at our ‘traceroute’ Output Data

The 3-line program we wrote above produces the following output:

traceroute to periaptapis.net (147.182.254.138), 30 hops max, 60 byte packets
 1  home (192.168.0.1)  0.267 ms  0.618 ms  0.583 ms
 2  boid-dsl-gw13.boid.qwest.net (184.99.64.13)  3.425 ms  3.177 ms  3.032 ms
 3  184-99-65-97.boid.qwest.net (184.99.65.97)  3.143 ms  3.458 ms  3.038 ms
 4  ae27.edge6.Seattle1.Level3.net (4.68.63.133)  13.147 ms  13.216 ms  30.300 ms
 5  * * *
 6  * DIGITAL-OCE.edge9.SanJose1.Level3.net (4.71.117.218)  29.985 ms *
 7  * * *
 8  * * *
 9  * * *
10  147.182.254.138 (147.182.254.138)  31.194 ms  30.308 ms  29.753 ms

Oof, that is going to be a pain to parse out the data we want. Let’s take this apart.

The first line is something we already know – we are tracing the route to ‘periaptapis.net.’ From there, each line represents a station that the packet is routed through. These “stations” are things like routers and gateways. The last line is the host that we wanted to reach.

By default, there are 3 queries per station. This helps the user know a couple of things, like average round-trip time and whether there are other hosts that respond to packets at that station number. Below is the breakdown for one of our lines of output:

Example of traceroute output, broken down by component (station number, hostname and IP address of the station, Query #1 round-trip time, Query #2 round-trip time, and Query #3 round-trip time).
  1. The station number
  2. The hostname and IP address of the station
  3. Query #1 round-trip time
  4. Query #2 round-trip time
  5. Query #3 round-trip time

If there is an asterisk in a field, the station didn’t respond to the packet we sent. Also, there is a chance that we get a different host in any of the three queries for a given station number. This is due to network configurations to handle incoming requests.

Simplifying Output

To simplify the output of the ‘traceroute’ program and make our parsing lives easier, we will introduce the ‘-n’ and ‘-N1’ flags to our ‘traceroute’ program call.

-N1

The ‘-N1’ tells the ‘traceroute’ program to only use one packet at a time.

Sending one packet at a time is just a kindness to the devices we will be tracing so we don’t send a bunch of packets each time we want to trace a route. The downside is that the results will be slower to come in.

-n

The ‘-n’ simply tells traceroute to not print the hostnames, simplifying things further by obtaining only the IP address of the station.

With those two flags added, our call now looks like this:

ret = subprocess.run(["traceroute", "-4", "-N1", "-n", "periaptapis.net"], capture_output=True)

Our call also produces a clean output:

traceroute to periaptapis.net (147.182.254.138), 30 hops max, 60 byte packets
 1  192.168.0.1  0.657 ms
 2  184.99.64.13  9.288 ms
 3  184.99.65.97  3.681 ms
 4  4.68.63.133  14.012 ms
 5  4.69.219.65  29.552 ms
 6  4.71.117.218  30.166 ms
 7  *
 8  *
 9  *
10  147.182.254.138  30.837 ms

This will make our lives much easier!

Coding an Object

Finally, we get to creating the object that will be importable and usable by another Python program. We will call this object ‘TraceRouteInstance,’ and it will have a single constructor input, ‘hostname_or_ip,’ which we will use in our ‘traceroute’ call.

So far, our object looks like this:

import subprocess

class TraceRouteInstance:
    def __init__(self, hostname_or_ip: str):
        self.traceroute_data = subprocess.run(
            ["traceroute", "-4", "-N1", "-n", hostname_or_ip], 
            capture_output=True
            )

We store our ‘traceroute’ return data in a ‘traceroute_data’ variable for later parsing. To store each of the stations, we will use a named tuple, shown below:

Station = namedtuple('Station', ['ip', 'latency_ms'])

Note that namedtuple field names must be valid Python identifiers, so a name like ‘latency(ms)’ would raise a ValueError. Aside from that, the syntax might look a little strange, but it enables us to reference the data in the tuple by name, which looks cleaner in the long run.
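As a quick illustration of why named fields pay off — the field names below match the final wrapper code, and the values come from the first hop of the sample output:

```python
from collections import namedtuple

Station = namedtuple('Station', ['ip', 'latency_ms'])

# Values taken from the first hop of the sample traceroute output
s = Station('192.168.0.1', 0.657)

# Fields are readable by name instead of by index
print(s.ip)          # 192.168.0.1
print(s.latency_ms)  # 0.657
print(s[0] == s.ip)  # True
```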

Parsing

Next we need to parse the output.

One way to do this would be by stripping leading and trailing white space on each line, then checking for a digit at the first character. Stripping the line of white space at the start will ensure that the line has a digit as its first character (if it is a line we care about).

This is how a newcomer might approach the problem.
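For reference, that naive approach might look something like this; the sample line is from the simplified output above, and the fixed field positions are an assumption that makes the approach fragile:

```python
# One hop line from the simplified traceroute output shown earlier
raw_line = " 4  4.68.63.133  14.012 ms"

line = raw_line.strip()          # drop leading/trailing whitespace
if line and line[0].isdigit():   # hop lines start with a digit once stripped
    fields = line.split()        # ['4', '4.68.63.133', '14.012', 'ms']
    ip_address = fields[1]
    latency = float(fields[2])
    print(ip_address, latency)   # 4.68.63.133 14.012
```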

However, to avoid so much string manipulation, I always suggest using regular expressions for data extraction from lines of text. Regular expressions are difficult to understand in the beginning, but once you do understand them, they will save you headaches down the road and immensely reduce the complexity of data parsing.

Enter Regular Expression

Let’s take a look at the regular expression we will be using for this exercise:

"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"

In this, we have 3 named groups: ‘station_number,’ ‘ip_address,’ and ‘latency.’ We can use this regular expression to search a line and reference the named groups to extract the data we want.
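Before wiring the expression into a class, it is worth sanity-checking it against one of the lines from the simplified output above:

```python
import re

station_regex = r"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"

# A hop line from the simplified traceroute output shown earlier
line = " 4  4.68.63.133  14.012 ms"

re_match = re.search(station_regex, line)

# Named groups pull out exactly the fields we care about
print(re_match.group("station_number"))  # 4
print(re_match.group("ip_address"))      # 4.68.63.133
print(float(re_match.group("latency")))  # 14.012
```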

Our parse function looks like this:

    def parse_data(self):
        station_regex = r"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"

        for line in self.traceroute_data.split("\n"):
            re_match = re.search(station_regex, line)

            if re_match:
                ip_address = re_match.group("ip_address")
                latency = float(re_match.group("latency"))

                self.route_list.append(Station(ip_address, latency))
            elif '*' in line:
                self.route_list.append(Station('*', '*'))

We take the output line-by-line, searching the line for our regular expression. If the regular expression is found, we use the named groups to extract the data from the line, adding it to our ‘route_list.’

If we don’t find the regular expression but we see an asterisk, we assume that the station didn’t respond and add a default value to our ‘route_list.’

Final Object Code

Finally, our importable ‘traceroute’ Python wrapper looks like this:

import subprocess
import re
from collections import namedtuple

Station = namedtuple('Station', ['ip', 'latency_ms'])

class TraceRouteInstance:
    def __init__(self, hostname_or_ip: str):
        self.traceroute_data = subprocess.run(
            ["traceroute", "-4", "-N1", "-n", hostname_or_ip], 
            capture_output=True
            ).stdout.decode()
        self.route_list = []

        self.parse_data()
    
    def parse_data(self):
        station_regex = r"(?P<station_number>\d+)  (?P<ip_address>\d+\.\d+\.\d+\.\d+)  (?P<latency>\d+\.\d+) ms"

        for line in self.traceroute_data.split("\n"):
            re_match = re.search(station_regex, line)

            if re_match:
                ip_address = re_match.group("ip_address")
                latency = float(re_match.group("latency"))

                self.route_list.append(Station(ip_address, latency))
            elif '*' in line:
                self.route_list.append(Station('*', '*'))

As a simple test, let’s write a main function in this script that utilizes the above object and prints out the trace list.

if __name__ == "__main__":
    tr = TraceRouteInstance("periaptapis.net")
    print(tr.route_list)

If you run this, it will eventually produce the following output:

[Station(ip='192.168.0.1', latency_ms=0.624), Station(ip='184.99.64.13', latency_ms=3.244), Station(ip='184.99.65.97', latency_ms=3.445), Station(ip='4.68.63.133', latency_ms=13.911), Station(ip='4.69.219.65', latency_ms=29.505), Station(ip='4.7.18.10', latency_ms=30.912), Station(ip='*', latency_ms='*'), Station(ip='*', latency_ms='*'), Station(ip='*', latency_ms='*'), Station(ip='147.182.254.138', latency_ms=31.139)]

This output shows that we have successfully extracted the route that a packet would take from my machine to requesting data from ‘periaptapis.net.’

Conclusion

In this post, we explored wrapping a Linux program with a Python interface to make it more accessible to other Python programs. If you run this program, however, you will notice how long the object takes to be created. This is due to the length of time ‘traceroute’ needs to trace the route.

In a future post, we might explore options on how to mitigate the effects of this load time for external programs by using the ‘asyncio’ built-in Python library or by multi-threading.

I hope you enjoyed this week’s post! Please add any comments, suggestions, or questions that you may have! If you are interested in other articles I have written, check out these links:

Thank you for reading!

-Travis

C++ > Python EXPOSED: Powerful Language Makes Case with Data

C++ wrestles Python

Has my ego inflated in the last couple of weeks from programming in C++? Maybe, maybe not… All I know is that C++ is leagues faster than Python. Don’t believe me? Well, I am glad you are here! This week, we will be converting a Python program I wrote and documented here to its C++ counterpart.

At the end of this post, I hope to have concrete evidence of just how much faster C++ can be! Let’s dive in!

Measurement

First, we must take measurements on the portion of code we will be converting. If you haven’t read the original article here, I recommend you give it a peek. Here’s a snippet of how the original program generates mazes:

for i in range(args.num_to_generate):
    maze = WordMaze(args.word, args.grid_width, args.grid_height, args.pixel_width, args.pixel_height, args=args)
    maze.save_image(args.filename.split(".")[0] + "_" + str(i) + ".png")

We will be timing the maze generation and the image storage. To do this, we will utilize the following generator function for a set of input values:

def test_maze_inputs():
    words = ["Hello", "Random", "Testing", "Maze", "Generation"]
    grid_widths = [(20, 20), (40, 40), (80, 80), (160, 160)]

    for word in words:
        for width in grid_widths:
            yield argparse.Namespace(word=word, grid_width=width[0], grid_height=width[1], pixel_width=20, pixel_height=20)

We will be building mazes at the sizes of 20 x 20, 40 x 40, 80 x 80, and 160 x 160. These will be pixel sizes 400 x 400, 800 x 800, 1600 x 1600, and 3200 x 3200, respectively. The values chosen are arbitrary and all runs will be done on the same system.

Below is the timing test run code:

if __name__ == "__main__":
    # args, _ = parseargs()
    maze_generation_times = []
    image_save_times = []
    for test_input in test_maze_inputs():
        print(f"Testing Inputs: {test_input}")

        # Time Maze Generation
        start_time = time.perf_counter()
        maze = WordMaze(test_input.word, test_input.grid_width, test_input.grid_height, test_input.pixel_width, test_input.pixel_height)
        stop_time = time.perf_counter()
        total_time = stop_time - start_time
        maze_generation_times.append(total_time)

        # Time Maze Saving
        start_time = time.perf_counter()
        maze.save_image("maze.png")
        stop_time = time.perf_counter()
        total_time = stop_time - start_time
        image_save_times.append(total_time)
    # Print our table
    print("CSV:\n")
    header_printed = False
    
    for ti, test_input in enumerate(test_maze_inputs()):
        if not header_printed:
            for key in test_input.__dict__:
                print(key, end=",")
            print("Generation Time,Image Save Time,")
            header_printed = True
        
        for key in test_input.__dict__:
            print(test_input.__dict__[key], end=",")
        print(f"{maze_generation_times[ti]},{image_save_times[ti]},")

It will print out a nice CSV-style output for us to curate in Excel:

All right, we have our times to beat! Let’s convert some code!

C++ FTW… Maybe

To make this as fair as possible, I will do my best not to adjust the design of the program. I will simply translate it to C++ and run the same benchmarks.

One does not simply translate from Python to C++

Basically Anyone (except Travis)

Here I am a week later, literally. I have a functioning C++ program that generates the same output you would expect to see from the Python script in my previous post.

Oh boy, it even looks like it has a memory leak! Let’s go ahead and try to fix that up really quickly….

DONE! My brain is fried. I am going to simply let the results speak for themselves:

Python vs C++ Results

Python and C++ clearly diverge even at the first data point. On the bar graph, you can see the stark differences by maze size. For the largest maze size in this test (3200 x 3200 pixels), Python was about 4 times slower than C++!

On the scatter plot, you can see the time differences plotted out by actual square pixels. The linear fits of the data are pretty good, with R-squared values slightly exceeding 0.99. The linear fit equations could be useful for estimating time values for other maze sizes that we did not test.

The desired maze size (in square pixels) would be substituted for the ‘x,’ and solving for the ‘y’ would result in a time estimate. You can see that the slope (time divided by square pixels) of the Python algorithm is steeper than C++, further showcasing C++’s efficiency.
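As a worked example of that substitution — note the slope and intercept below are placeholders for illustration, not the fitted values from my data:

```python
# Placeholder fit coefficients: slope in seconds per square pixel,
# intercept in seconds. Substitute the values from your own regression.
slope, intercept = 2.0e-7, 0.05

def estimate_seconds(square_pixels: int) -> float:
    # y = m*x + b, where x is the maze area in square pixels
    return slope * square_pixels + intercept

# Estimate for the largest maze size in the test (3200 x 3200 pixels)
print(estimate_seconds(3200 * 3200))
```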

Conclusion

This was an excellent learning experience for me. In the process of rewriting my code, I got a close-up look at various inconsistencies and poor algorithmic choices that came along with my first implementation.

I have yet to fix these things, as it was important to me to have an apples-to-apples comparison between the two codebases. If you compare the GitHub repositories (Python vs C++), you will find there are very few design differences between the two.

Thank you for reading this week!

-Travis

Neural Networks Exposed: Beautiful Genetic Algorithms in C++

Genetic algorithm generations help neural networks learn

This week, we are generating feed-forward neural networks which learn through a genetic algorithm. These networks are much easier to understand than their back-propagation counterparts. Please join me on this step-by-step guide for simplistic neural networks as they learn through generations of selection.

Neural Networks

Neural networks can be a difficult subject to dive into during ground-up implementation. Here, I will explain a simple ‘feed-forward’ network, as we do not need back-propagation for this article. Fortunately, back-propagation is the only part of a neural net which requires deep knowledge of calculus.

So, if you fear feeling dumb like me, fret not! Multiplication is all we need!

I chose evolution because it requires less math 😉

Not Charles Darwin

Feed-Forward Neural Networks

A feed-forward neural network only feeds data in one direction across the network.

Showing how neural networks work

The above diagram shows a neural network with 2 inputs, 4 hidden layer neurons, and 2 output neurons. The data flows from left to right. At each step, the data is modified in a parallel fashion by each neuron.

Showing flow of data in neural networks

A neuron simply takes in all of the given inputs, multiplies each input by a paired weight stored by the neuron, and sums these products together. That sum is passed through a sigmoid function, whose result is the neuron’s output.

Looking back at our neural network diagram, the hidden layer neurons each have 2 inputs. So, each neuron will store 2 weights. Therefore, the hidden layer stores 8 weights in total.
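That computation fits in a few lines. Here is a minimal Python sketch of a single neuron (a sketch only — the real project is C++, and the input and weight values here are arbitrary):

```python
import math

def sigmoid(x: float) -> float:
    # Squash any real number into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights) -> float:
    # Multiply each input by its paired weight, sum, then squash
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(weighted_sum)

# A hidden-layer neuron with 2 inputs stores 2 weights
out = neuron_output([0.5, -0.25], [0.8, 0.4])
print(out)
```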

Neural Networks and Genetic Algorithms

The basic rule which governs my algorithm is:

At the start of each generation, select the highest performing agents from the previous generation and combine their neural networks to form the next generation’s agents.

The important factors are 1) how you select the top performers, and 2) how the next generation is stitched together from the previous winners.

Code Time

The code base for this project has a lot of additional functionality that is necessary to facilitate agents’ learning. However, to keep this post short and sweet, I am going to focus on the key areas that make this process work.

The Problem to Solve

The problem which our agents must solve is simple. They must get as close to the goal as possible, given the number of ‘ticks’ in a generation.

Overview

A ‘tick’ is a time step allotted to each agent in which they can choose which direction to move and how far that move will be.

When all ticks have been counted, the program orders the agents by least to greatest distance from the goal. Next, it selects a few agents from the top (closest to goal) to seed our next generation. This process continues until we reach the last configured generation.
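The ordering-and-selection step can be sketched in a few lines of Python (the agent names and distances here are invented for illustration):

```python
# Invented final distances from the goal for six agents
distances = {"a1": 40.2, "a2": 12.5, "a3": 88.0, "a4": 5.1, "a5": 60.3, "a6": 19.9}

# Order agents from least to greatest distance from the goal
ranked = sorted(distances, key=distances.get)

# Seed the next generation from the configured number of top performers
winners = ranked[:2]
print(winners)  # ['a4', 'a2']
```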

Main Loop

    for (int generation = 0; generation < NUM_GEN; generation++)
    {
        // Set up this generation's agents
        if (!closest)
            // Set up our initial generation of agents
        else
            // Stitch the next set of agents from those that came closest in the previous generation

        // Run our simulation
        run_sim(*agents, goal);

        // Rank our agents and take the configured number of top performers
        closest = get_closest_agents(*agents, goal, generation, NUM_AGENTS_SELECTED_EACH_GENERATION);

        if (PRINT_GENERATION_PERFORMANCE)
            std::cout << closest->at(0)->distance << "," << mutation_chance << std::endl;
    }

Our main loop is simple:

  1. Set up the generation’s agents.
  2. Run the simulation.
  3. Rank the agents by final distance to the goal.

Setting Up the Neural Networks

The first generation is a set of agents. Each agent has a neural network that is initialized with random weight values and contains the following elements:

  • 2 inputs (distance to goal in x- and y-directions)
  • 10 hidden neurons
  • 4 outputs

Two of the outputs represent the x- and y-distances that the agent wants to move, each between 0 and 1. The other two outputs dictate whether the direction of movement is positive or negative.
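One plausible way to decode those four outputs into a move looks like this — note the 0.5 threshold and this exact decoding are my assumptions for illustration, not necessarily what the project does:

```python
def outputs_to_move(outputs):
    # outputs[0], outputs[1]: desired step sizes in x and y, each in (0, 1)
    # outputs[2], outputs[3]: direction flags; above 0.5 means positive
    dx = outputs[0] * (1 if outputs[2] > 0.5 else -1)
    dy = outputs[1] * (1 if outputs[3] > 0.5 else -1)
    return dx, dy

print(outputs_to_move([0.25, 0.5, 0.9, 0.1]))  # (0.25, -0.5)
```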

The following code implements agent creation:

for (int agent_i = 0; agent_i < NUM_AGENTS_PER_GEN; agent_i++)
    {

        // Check if we are basing our agents on anything
        if (based_on)
        {
            int choice1 = get_rand_int(0, based_on->size() - 1), choice2 = get_rand_int(0, based_on->size() - 1);
            while (choice2 == choice1)
                choice2 = get_rand_int(0, based_on->size() - 1);
            // Using default mutation delta
            agents->push_back(new Agent(new Position(start_pos->x, start_pos->y), based_on->at(choice1)->agent, based_on->at(choice2)->agent, mt, mutation_chance));
        }
        else
        {
            // Create agents with two sensors, distance from goal x and y
            agents->push_back(new Agent(new Position(start_pos->x, start_pos->y), 2));
        }
    }

Simulation Run

The simulation runs for the number of ticks configured. At each tick, it “asks” the agent for a desired distance and direction, based on the agent’s current distance from the goal.

void run_sim(std::vector<Agent *> agents, Position *goal)
{
    for (int tick = 0; tick < NUM_TICKS_PER_GEN; tick++)
    {
        for (Agent *a : agents)
        {
            // Sense the Agent's distance from the goal
            float sensors[2] = {
                (a->positions.back()->x - goal->x) / BOUNDARY_EDGE_LENGTH,
                (a->positions.back()->y - goal->y) / BOUNDARY_EDGE_LENGTH};
            // Ask the Agent what it wants to do
            a->move(sensors);
        }
    }
}

A Few More Specifics

Selecting top performers

It is important to note that the way you select your ‘winners’ greatly determines the behavior of your agents. My program selects winners in the following code:

    std::vector<AgentDistancePair *> *agent_distance_pairs = new std::vector<AgentDistancePair *>;

    for (Agent *a : agents)
        agent_distance_pairs->push_back(new AgentDistancePair(a, get_distance(*(a->positions.back()), *goal)));

The algorithm pairs each agent’s pointer to its respective distance from the goal. This pairing is encapsulated in the ‘AgentDistancePair’ object. Then, the program sorts the vector filled with these pairings using the following function:

bool compare_agent_distance_pair(AgentDistancePair *first, AgentDistancePair *second)
{
    return bool(first->distance < second->distance);
}

Finally, we reduce the number in the list to the number specified by the caller. In other words, how many of the top performers do we want? At the same time, we are being careful to manage our memory properly.

while (agent_distance_pairs->size() > num_to_find)
    {
        delete agent_distance_pairs->back();
        agent_distance_pairs->pop_back();
    }

All together, the whole function looks like this:

std::vector<AgentDistancePair *> *get_closest_agents(std::vector<Agent *> agents, Position *goal, int generation_number, int num_to_find = 2)
{
    std::vector<AgentDistancePair *> *agent_distance_pairs = new std::vector<AgentDistancePair *>;

    for (Agent *a : agents)
        agent_distance_pairs->push_back(new AgentDistancePair(a, get_distance(*(a->positions.back()), *goal)));

    sort(agent_distance_pairs->begin(), agent_distance_pairs->end(), compare_agent_distance_pair);

    if (DRAW_GENERATION_PERFORMANCE && (generation_number % DRAW_EVERY_NTH_GENERATION) == 0 && DRAW_FULL_POPULATION)
        draw_agent_path(agent_distance_pairs, goal, generation_number);

    while (agent_distance_pairs->size() > num_to_find)
    {
        delete agent_distance_pairs->back();
        agent_distance_pairs->pop_back();
    }

    if (DRAW_GENERATION_PERFORMANCE && (generation_number % DRAW_EVERY_NTH_GENERATION) == 0 && !DRAW_FULL_POPULATION)
        draw_agent_path(agent_distance_pairs, goal, generation_number);

    return agent_distance_pairs;
}

Stitching together agents of the past

In a genetic algorithm, the neural networks learn by being stitched with other top performers, sprinkled with a bit of randomness. In order to stitch the weights together, we choose 1 of 3 different methods:

  • EveryOther
  • SingleSplit
  • RandomChoice

EveryOther

The ‘EveryOther’ method selects each weight of the child agent by alternating evenly between the two parents’ weights, as shown in the diagram below:

SingleSplit

The ‘SingleSplit’ method uses a randomly selected weight index as the point where responsibility for providing weights to the child switches to the other parent. You can see an example of this happening at the ‘Split Line’ in the diagram below:

RandomChoice

The ‘RandomChoice’ method randomly selects weights from both parents as it builds the child’s weights. No diagram needed ;).
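All three stitching methods are easy to sketch in Python (the real implementation is C++; the parent weight lists here are invented so the results are obvious):

```python
import random

def every_other(p1, p2):
    # Alternate evenly: even indices from parent 1, odd from parent 2
    return [p1[i] if i % 2 == 0 else p2[i] for i in range(len(p1))]

def single_split(p1, p2, split_index):
    # Parent 1 supplies weights before the split line, parent 2 after
    return p1[:split_index] + p2[split_index:]

def random_choice(p1, p2):
    # Each child weight comes from a randomly chosen parent
    return [random.choice(pair) for pair in zip(p1, p2)]

a = [1.0, 1.0, 1.0, 1.0]
b = [2.0, 2.0, 2.0, 2.0]
print(every_other(a, b))      # [1.0, 2.0, 1.0, 2.0]
print(single_split(a, b, 2))  # [1.0, 1.0, 2.0, 2.0]
```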

Conclusion

This is Generation #5 of a ‘SingleSplit’ method simulation. The green dot is the selected agent of the generation and the yellow dot is its goal!

This has been a huge learning experience for me! I will definitely be adding tiny brains to many things in the future.

If you are interested in exploring or cloning this code, you can access it on my Github here. Thank you for reading! I hope you found this helpful and thought-provoking! As always, feel free to leave a comment or suggestion below.

Further reading on my work in simulations is available at these links:

SFML And C++ Empower The Life Of A Virtual Forest

This week, I am exploring a simple application that simulates tree propagation using SFML and C++. Join me as I build a 2-D forest with a keyboard!

This is the first of a series of posts I am releasing about software simulations in C++ using SFML. My work on simulations is an ever-growing exploration and has only recently taken a turn into the land of C++.

The Basics (Dynamic SFML graphics)

I have a background in C, but up until now, my acumen in C’s object-oriented relative (C++) has been limited to “Hello World.” So, I was quite nervous to translate my usual techniques for Object Oriented Programming (OOP) in Python to my new adventures in C++.

I started off by setting up my environment to be able to compile and link against the SFML library (you can read more details here).

With the environment ready, I worked on drawing to the screen. I created a program which randomly spawns yellow dots (“people”) with an area that slowly spawns green dots (“forest”):

SFML graphics in C++ depicting generation of people (yellow dots) and trees (green dots)

This is a pretty basic start, but I felt good about being able to draw something. Each of these dots represents an object, and some objects inherit from a parent. Thus, I developed basic building blocks for a more complex simulation.

Problem

There was only one problem with the above simulation: Let’s just say…it really leaves a lot to the imagination. Actually, maybe that’s what this should be about…exercising your imagination. It sure would make a programmer’s job a lot easier…

Grids… Structure… Yeah!

I created a simple 2-D array of ‘GridCell’ pointers. To keep things simple, any object that exists in the grid must inherit from GridCell.

Simulation Objects (with Delegated SFML Drawing)

There are three simulation objects:

  • SeedCell (Introduced in tag v0.2)
  • LeafCell (Introduced in tag v0.1)
  • TrunkCell (Introduced in tag v0.1)

Progressive Overview

Everything starts from a seed. The SeedCell has a chance of “becoming” a TrunkCell. A TrunkCell, once fully matured, will grow LeafCells in its four compass directions.

I decided to allow the LeafCells to propagate a little further than a single grid block away from the TrunkCell before dropping more SeedCells.

GridCell

Remember, everything has to inherit from the GridCell class if it wants to exist in our grid.

class GridCell : public Cell{
    protected:
    sf::RectangleShape shape;
    uint32_t tick_born;
    int grid_height, grid_width;
    public:
    GridCell(sf::Vector2f pos, sf::RenderWindow *window, uint32_t tick_born): Cell(pos, window), tick_born(tick_born), grid_height(config::grid_edge_size), grid_width(config::grid_edge_size){}

    sf::Vector2f get_pos_in_direction(Directions in_direction){
        if (in_direction == all){
            return sf::Vector2f(-1,-1);
        }

        if (in_direction == north){
            return sf::Vector2f(pos.x, (pos.y > 0) ? (pos.y - 1) : grid_height - 1);
        }
        if (in_direction == east){
            return sf::Vector2f((pos.x + 1) < grid_width ? pos.x + 1 : 0, pos.y);
        }
        if (in_direction == south){
            return sf::Vector2f(pos.x, (pos.y + 1) < grid_height ? pos.y + 1: 0);
        }
        if (in_direction == west){
            return sf::Vector2f((pos.x > 0) ? (pos.x - 1) : grid_width - 1, pos.y);
        }
        // Fallback to avoid undefined behavior if no direction matched
        return sf::Vector2f(-1,-1);
    }
    virtual void step(GridCell **N, GridCell **E, GridCell **S, GridCell **W, int current_tick){
        // Take actions here
    }
    virtual void draw(){
        // Draw Here
    }
};

You might notice this also inherits from a Cell class. This simple class is actually an artifact of an earlier experiment. Eventually, I will find time to join these objects together again, but at the time of this publishing they are still [sadly] separate.

In the chunk of code above, notice the step function’s signature. When the world ticks forward in time, it hands each game object a pointer to the pointer of whatever object may (or may not) occupy each of its immediate compass neighbors.

As we will see in the TrunkCell, this is useful for the technique I used to propagate objects on the grid.
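The wrap-around logic in get_pos_in_direction treats the grid as a torus: stepping off one edge re-enters from the opposite side. Here is a minimal Python sketch of the same idea (the function and parameter names are mine, not the project’s):

```python
def pos_in_direction(x, y, direction, grid_width, grid_height):
    """Return the neighboring grid position, wrapping toroidally at the edges."""
    dx, dy = {"north": (0, -1), "east": (1, 0),
              "south": (0, 1), "west": (-1, 0)}[direction]
    # Python's modulo is non-negative for a positive divisor,
    # so one expression covers all four edges
    return ((x + dx) % grid_width, (y + dy) % grid_height)
```

Because of that modulo behavior, the four explicit conditionals in the C++ version collapse into a single expression here.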

SeedCell

This simple little object brings so much more if it is chosen to sprout! Below is all it takes to be a ‘seed’ in the simulation.

class SeedCell: public GridCell{
    public:
    using GridCell::GridCell;
};

This is the only object that is manipulated by the grid. It must be controlled this way because of the inheritance structure.

TrunkCell

If the SeedCell ‘transforms,’ it becomes a TrunkCell which matures over time. By ‘transform,’ I mean that if the grid randomly generates a boolean value of True, then the program deletes the ‘seed’ object and replaces it with a TrunkCell.

class TrunkCell: public GridCell{
    float density = 0;
    float current_maturity = 0;
    float total_time_alive = 0;
    int maturity_time = 10;
    int current_season = 0;
    int past_season = 0;
    int season_allowable_new_growth = 1;
    public:
    using GridCell::GridCell;
    void draw(){
        float x_offset = (area.x - (area.x * density)) / 2;
        float y_offset = (area.y - (area.y * density)) / 2;


        shape.setPosition(sf::Vector2f((pos.x * area.x) + x_offset, (pos.y * area.y) + y_offset));
        shape.setSize(area * density);
        shape.setFillColor(sf::Color(0x96, 0x4B, 0));
        window->draw(shape);
    }
    void step(GridCell **N, GridCell **E, GridCell **S, GridCell **W, int current_tick){
        if (density >= 1){
            if(!*N){
                *N = new LeafCell(get_pos_in_direction(north), window, 3, &season_allowable_new_growth, current_tick);
            }
            if(!*E){
                *E = new LeafCell(get_pos_in_direction(east), window, 3, &season_allowable_new_growth, current_tick);
            }
            if(!*S){
                *S = new LeafCell(get_pos_in_direction(south), window, 3, &season_allowable_new_growth, current_tick);
            }
            if(!*W){
                *W = new LeafCell(get_pos_in_direction(west), window, 3, &season_allowable_new_growth, current_tick);
            }
        }
        if (density < 1){
            density = ((float) current_tick - tick_born) / maturity_time;
        }else{
            density = 1;
        }

        current_season = (current_tick - tick_born) / maturity_time;

        if (current_season != past_season){
            season_allowable_new_growth = 1;
            past_season = current_season;
        }
    }
};

When the TrunkCell reaches full maturity, it spawns LeafCells at each compass direction around itself, if the positions in the grid are empty. To do this, it sets the pointer reference to a new object that is spawned into the grid. This only works because of the assumption that the propagation is happening in a single execution thread.
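The pointer-to-pointer trick can be mimicked in Python with a mutable one-element list per neighbor slot: assigning to slot[0] is visible to the grid, just as writing through a GridCell** is in the C++ version. A rough sketch (the names are mine, and it assumes a single thread, as the original does):

```python
def spawn_leaves(neighbor_slots, make_leaf):
    """Fill every empty neighbor slot with a newly spawned leaf."""
    for slot in neighbor_slots:
        # each slot is a one-element list standing in for a GridCell**
        if slot[0] is None:
            slot[0] = make_leaf()

north, east, south, west = [None], ["occupied"], [None], [None]
spawn_leaves([north, east, south, west], lambda: "leaf")
# north, south, and west now hold "leaf"; east keeps its occupant
```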

LeafCell

The LeafCells have a propagation-level counter and an integer pointer assigned by the originating TrunkCell. The propagation-level counter tracks the distance from the originating TrunkCell.

class LeafCell: public GridCell{
    float density = 0;

    int current_life_cycle = 0;

    int total_life_cycle = 10;
    int ripe_time = 2;
    int old_time = 6;
    int *new_growth_count;
    int propagation_level;

    sf::Color fill;
    sf::Color ripe;
    sf::Color old;
    public:
    LeafCell(sf::Vector2f pos, sf::RenderWindow *window, int propagation_level, int *new_growth_count, uint32_t tick_born) : GridCell(pos, window, tick_born), new_growth_count(new_growth_count), propagation_level(propagation_level), ripe(0, 255, 0), old(218, 165, 32){
        shape.setPosition(sf::Vector2f(pos.x * area.x, pos.y * area.y));
        shape.setSize(area);
    }
    void draw(){
        shape.setFillColor(fill);
        window->draw(shape);
    }
    void propagate_leaves(GridCell **N, GridCell **E, GridCell **S, GridCell **W, int current_tick){
        if((*N) && (*E) && (*S) && (*W)) return;
        if (propagation_level <= 0) return;

        bool prop_n = get_rand_bool(0.25);
        bool prop_e = get_rand_bool(0.25);
        bool prop_s = get_rand_bool(0.25);
        bool prop_w = get_rand_bool(0.25);

        if(!(*N) && *new_growth_count > 0 && prop_n) {
            *N = new LeafCell(get_pos_in_direction(north), window, propagation_level - 1, new_growth_count, current_tick);
            (*new_growth_count)--;
        }
        if(!(*E) && *new_growth_count > 0 && prop_e) {
            *E = new LeafCell(get_pos_in_direction(east), window, propagation_level - 1, new_growth_count, current_tick);
            (*new_growth_count)--;
        }
        if(!(*S) && *new_growth_count > 0 && prop_s){
            *S = new LeafCell(get_pos_in_direction(south), window, propagation_level - 1, new_growth_count, current_tick);
            (*new_growth_count)--;
        } 
        if(!(*W) && *new_growth_count > 0 && prop_w) {
            *W = new LeafCell(get_pos_in_direction(west), window, propagation_level - 1, new_growth_count, current_tick);
            (*new_growth_count)--;
        }
    }

    void propagate_seed(GridCell **N, GridCell **E, GridCell **S, GridCell **W, int current_tick){
        if((*N) && (*E) && (*S) && (*W)) return;
        if (propagation_level > 0) return;

        bool prop_n = get_rand_bool(0.01);
        bool prop_e = get_rand_bool(0.01);
        bool prop_s = get_rand_bool(0.01);
        bool prop_w = get_rand_bool(0.01);

        if(!(*N) && *new_growth_count > 0 && prop_n) {
            *N = new SeedCell(get_pos_in_direction(north), window, current_tick);
            (*new_growth_count)--;
        }
        if(!(*E) && *new_growth_count > 0 && prop_e) {
            *E = new SeedCell(get_pos_in_direction(east), window, current_tick);
            (*new_growth_count)--;
        }
        if(!(*S) && *new_growth_count > 0 && prop_s){
            *S = new SeedCell(get_pos_in_direction(south), window, current_tick);
            (*new_growth_count)--;
        } 
        if(!(*W) && *new_growth_count > 0 && prop_w) {
            *W = new SeedCell(get_pos_in_direction(west), window, current_tick);
            (*new_growth_count)--;
        }
    }

    void step(GridCell **N, GridCell **E, GridCell **S, GridCell **W, int current_tick){
        current_life_cycle = (current_tick - tick_born) % total_life_cycle;

        if(current_life_cycle <= ripe_time){
            if (current_life_cycle == 0 && (current_tick - tick_born) > 0){
                propagate_leaves(N, E, S, W, current_tick);
                propagate_seed(N, E, S, W, current_tick);
            }
            density = current_life_cycle / (float) ripe_time;
            fill = ripe;
        }else if (current_life_cycle > ripe_time && current_life_cycle <= old_time){
            int total_steps = old_time - ripe_time;
            int current_step = current_life_cycle - ripe_time;

            fill.r = ripe.r + (current_step * ((old.r - ripe.r) / total_steps));
            fill.g = ripe.g + (current_step * ((old.g - ripe.g) / total_steps));
            fill.b = ripe.b + (current_step * ((old.b - ripe.b) / total_steps));
        }else {
            fill = old;
            density = ((float) total_life_cycle - current_life_cycle) / (total_life_cycle - old_time);
        }
        
        if(density > 1){
            density = 1;
        }
        fill.a = 255 * density;
    }
};

As a secondary constraint, there is a ‘season_allowable_new_growth’ integer pointer to slow the propagation of LeafCells. This pointer is controlled by the TrunkCell to allow only a certain number of LeafCells to be spawned each season.

Once the propagation-level value reaches zero, a LeafCell will randomly drop SeedCells in the empty compass directions around itself.
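The seasonal budget mechanism can be sketched in Python as a counter that each spawn attempt checks and decrements, mirroring how the C++ code decrements through the shared ‘season_allowable_new_growth’ pointer (the structure below is illustrative, not the project’s code):

```python
import random

def try_spawns(empty_slots, budget, probability, rng):
    """Attempt to grow into each empty slot while the seasonal budget lasts."""
    grown = []
    for name in empty_slots:
        if budget <= 0:
            break  # the shared budget for this season is exhausted
        if rng.random() < probability:
            grown.append(name)
            budget -= 1
    return grown, budget

rng = random.Random(42)
# with a budget of 1, at most one of the four directions can grow this season
grown, remaining = try_spawns(["north", "east", "south", "west"], 1, 1.0, rng)
```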

Current State

You can check out the simulation’s behavior in its current form in the following YouTube demo:

The repository is currently at tag v0.3 at the time of publishing. If you want to recreate this for yourself, feel free to clone and check out that tag!

Thank you for reading 🙂 If you have any feedback or suggestions, please comment below!

Informative Simulation is More Politics than Money

Kivy plots for a simulation of populations, moving trends, and financial flows.

This week, I wanted to explore a non-zero-sum, agent-based simulation with political platforms and town populations. Given a set of towns with unique political platforms and individuals with political preferences who vote and can move, do we ever see a situation where all individuals are happy?


This is a highly simplistic model that does not conclusively show that free markets are non-zero-sum. Also, I’m definitely not an economics major!

Travis

Hurry! My Time is Limited!

If you are in a hurry, here’s the breakdown:

  • Metric choice matters when inspecting a market to find causation.
  • Money is a ‘grease’ for personal-value-based choice.
  • I still love simulations!!!

Metric Choice Matters in the Simulation

While building this simulation, I had the opportunity to install metric collection for a variety of components involved in town voting, market transactions, and population movement. This was required for some debugging to understand why things might not be going as expected. In some cases, it was because my intuition was off. Other times, it was a legitimate bug in my code.

There are definitely people left ‘unhappy’ in this model. Optimization occurs when everyone settles where they are happiest (or the least unhappy).

In this case, each person ends up in a place that, for them, isn’t worth leaving, but they may still be unhappy with their town’s politics. In other words, if there were a town whose politics aligned better with their values, they would move, but no such town exists.

Money is the Grease

For me, this project really solidified a different thought process regarding money. I have always seen money as something I must attain from a corporation in order to purchase goods for myself and my family. Building this model has shown me that there is another way of viewing money. Values come first and money is simply a grease to help you attain those values.

Watching this message unfold in the simulation, I realized that I sometimes lack clearly defined values, and the space gets filled with meaningless material purchases. Given a strong enough set of values, you can produce goods which others value. Then you can’t help but attain money, which incidentally helps you pursue the principles you cherish most.

This thought introduces a question for me. What happens when your values tightly align with the values of the people around you? What can be achieved in such an environment?

I Still Love Writing Simulations

I will be honest…most of the programs I write are not contrived at the outset. In fact, I often almost quit working on them out of embarrassment that there isn’t a clear reason to build. The question, ‘What are you going to get from this?’ creeps in every time. However, after building it, I always have new ‘A-ha!’ moments that feel inherently meaningful. This time was no exception.

Enough of that though! Let’s dig into the specifics of this simulation!

Person/Town Breakdown

Person in the Simulation

Political Preference

In our model, a person has a political ‘base’ preference represented by a binary number. Each bit is a political issue that they either agree with (1) or don’t agree with (0). In addition to this ‘base’ preference, each person is assigned a random ‘Hardheadedness’ value. ‘Hardheadedness’ represents the total number of political preference bits that a person must match with a town’s political preference bits before they cross the happy-unhappy threshold. These ‘Hard Preference Bits’ are selected and set at random during initialization.

Person object in simulation.

In the example above, you can see a person with a ‘Hardheadedness’ score of 3. You can also see their ‘Base Preference Bits,’ and the randomly-selected ‘Hard Preference Bits.’

Happiness Calculation

A person’s happiness is calculated by comparing the political climate (the town’s current political platform), bit by bit, to the person’s base political preference (‘Base Preference Bits’).

If one bit of the town’s political platform aligns with a person’s base preference bit, that 1:1 match adds one to their happiness score. On the other hand, if the town’s political platform bit does not align with the person’s base preference bit, it will only subtract from happiness if this same bit is True (1) for one of their ‘Hard Preference Bits.’ If ‘Hard Preference Bit’ is False (0) for this platform bit, the individual does not care about the mismatch. Let’s look at our example again:

Example of person and person's traits in simulation

Those ‘Hard Preference Bits’ are reflected here in the ‘Base Preference Bits’ as the underlined bits. The underlined bits are the only bits that can affect the happiness score in a negative way if they don’t align with their town’s political platform.

Happiness is capped at the total number of bits in the platform; in this case, the maximum happiness is 8. If a person is basically happy, their happiness starts at the platform bit count; otherwise, it starts at 0.
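A hedged Python sketch of the scoring rule described above, with preferences stored as integers (the bit representation is from the post; the exact clamping behavior is my assumption):

```python
def happiness(platform, base_pref, hard_bits, n_bits=8):
    """Score a person's agreement with a town platform, bit by bit."""
    score = 0
    for i in range(n_bits):
        mask = 1 << i
        if (platform & mask) == (base_pref & mask):
            score += 1  # any 1:1 match adds to happiness
        elif hard_bits & mask:
            score -= 1  # mismatches only hurt on hard-preference bits
    # capping at the bit count matches the text; the floor of 0 is an assumption
    return max(0, min(score, n_bits))
```

For example, a platform that matches every base-preference bit scores the full bit count, while mismatches on soft (non-hard) bits simply do not count.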

Deciding to move

A comparison between a person’s happiness and their randomly assigned happiness threshold decides whether they move or stay. The program generates this threshold randomly between 0.4 and 0.6, then compares it against a happiness decimal: the happiness score (0–8, described in the “Happiness Calculation” section) divided by the total number of happiness bits (currently 8), yielding a value between 0 and 1. If the happiness decimal is greater than the threshold, the person stays in their current town; if it is less, the person wants to move.
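The decision rule can be sketched as follows (the 0.4 to 0.6 range and the 8-bit total come from the text; the function shape is my own):

```python
import random

def wants_to_move(happiness_score, total_bits=8, rng=random):
    """True when the happiness decimal falls below a random 0.4-0.6 threshold."""
    threshold = rng.uniform(0.4, 0.6)
    return (happiness_score / total_bits) < threshold
```

A fully happy person (8 of 8) always clears the threshold and stays; a score of 0 always falls short and wants to move.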

Let’s say the person desires to move. Now they must visit all other towns to see which make them happiest. In some cases, even if they wish to move, they are still happiest where they are. So, they stay!

Remember, just because the person wants to move doesn’t mean they can. A secondary barrier becomes important: Money! There is a cost to moving. A person can only move if they can pay the price. We will explore this later in the explanation of the Town object.

Town in the Simulation

High-Level

A town is responsible for:

  • Conducting votes on the current political platform.
  • Tracking the population’s seniority hierarchy.
  • Keeping the town bank, taxing, and paying people.

Conducting Votes

When the program conducts a vote, it iterates over all persons in the town’s population, adding or subtracting a value for each bit in the political platform (see the explanation of a person’s political preference in the sections entitled “Political Preference” and “Happiness Calculation”). The majority result for each political platform bit (positive or negative) sets that bit in the town’s platform to 1 or 0.
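One way to sketch that per-bit majority vote in Python, again assuming integer-encoded preferences (the function name is mine):

```python
def conduct_vote(preferences, n_bits=8):
    """Set each platform bit by majority vote over the population's preferences."""
    platform = 0
    for i in range(n_bits):
        mask = 1 << i
        # each person votes +1 for a set bit and -1 for a cleared one
        tally = sum(1 if pref & mask else -1 for pref in preferences)
        if tally > 0:
            platform |= mask  # a positive majority turns the bit on
    return platform
```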

Tracking Seniority

At instantiation, the original persons in a population of a town receive randomly assigned seniority values between 0 and 1. This value represents the amount of time a person has spent in a specific town. For each step of the world, a person’s seniority factors into their wealth. As time passes, the seniority value increases slowly.

Keeping the bank

A town starts off with a specific amount of money in its bank. The bank slowly distributes money to its population based on seniority. A town also uses taxes and utility costs to keep money liquid and cyclic.

The program calculates taxes by multiplying a person’s annual income by the town’s income tax rate (a flat tax). It likewise calculates utility costs as the product of the town’s average wealth and a flat rate, multiplied by the individual’s seniority score. The idea is that if you have lived there longer, you should pay more. Perhaps I should invert this idea…I’m still not quite sure (let me know your thoughts in the comments below if you think this is fair or unfair!).
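Written out as formulas, the two charges might look like this (the function and parameter names are mine; the arithmetic follows the paragraph above):

```python
def taxes_due(annual_income, income_tax_rate):
    """Flat tax on a person's annual income."""
    return annual_income * income_tax_rate

def utility_cost(avg_town_wealth, utility_rate, seniority):
    """Flat rate on the town's average wealth, scaled by the person's seniority."""
    return avg_town_wealth * utility_rate * seniority
```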

The World

The ‘world’ is at the highest level. It is the overarching scheduler for towns and people. Its two chief responsibilities are:

  • Managing the market for moving (including money transfers).
  • Directing town voting cadence.

Managing a Market in the Simulation

A supply-and-demand model tracks the market for moving. To calculate demand, the program takes the number of people who want to move into the town and subtracts that value from the number who want to move out, then divides the result by the total number of people moving to obtain a ‘cost adjustment’ applied to a ‘base moving cost.’ Supply is considered infinite, as the towns have no population cap.
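Following that description literally, a sketch might look like the following. The ratio matches the text; how the adjustment is applied to the base cost is my assumption, and the floor of 50 comes from the plot notes in this post:

```python
def moving_cost(base_cost, want_out, want_in, total_moving, floor=50):
    """Adjust a base moving cost by a demand ratio, with a hard price floor."""
    if total_moving == 0:
        return max(base_cost, floor)
    # the text's ratio: people moving out minus people moving in, over all movers
    adjustment = (want_out - want_in) / total_moving
    # assumption: net inbound demand (negative adjustment) raises the price
    return max(base_cost * (1 - adjustment), floor)
```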

Voting Cadence

Every town votes on the same periodic cadence, which the program checks at each step of the world. The newly voted-in platform may cause some persons to move. It may also make people who just moved into a new town unhappy.

Dynamic Interface Orientation

  1. Plot 1 – Each town’s current population
  2. Plot 2 – Total persons moved at each time step
  3. Plot 3 – Total persons desiring to move
  4. Plot 4 – Total wealth of each town
  5. Plot 5 – Current cost to move to each town (Note: 50 = the ‘floor’ to this cost)

Observations

  1. Here, there are two towns that are valued higher than the other towns at the start. In other words, more people want to move in than want to move out.
  2. When people start moving, the populations of their former towns drop and the population of their new town increases.
  3. Here, wealth is tightly coupled with incoming populations as they bring money with them.
  4. Finally, the number of people wanting to move declines steadily, as people either become able to afford the move or demand falls to a level where the cost is affordable.

Conclusion

It was super fun to write this simulation! This exercise developed into something more than just a technical exercise. I learned some economic and behavioral concepts regarding personal values and monetary impact.

I hope you enjoyed this week’s post. Thank you for reading it! Feel free to leave a comment correcting me. As mentioned earlier, I am not an economist :). If you want to see the code that drove this post, you can find it here on my GitHub (warning, it’s a bit messy).

If you’d like to check out some of my previous work on simulations, then please go to these posts:

To check out the Python GUI platform I used for the plots in this article, please check out Kivy here.

Merry Christmas and Happy New Year!

Kivy, One of the Best and Powerful GUI Creators

Simulation-building is something I seem to return to over and over again. I really enjoy learning something new each time I witness how small behaviors produce emergent behavior. Graphics are an accessible way to conceptualize the complex nature of multi-variable simulations. In this article, I lay the foundation for a project I hope will serve as the first chapter for a much larger vision.

Let’s explore Kivy, a powerful Python app development library.

Spoiler alert: this has, by far, been my best experience with a Graphical User Interface (GUI) library in Python. For me, it provides the best of both worlds for building a small application around a very basic simulation:

  • Layout management and simple widgets for boiler-plate application objects, like buttons and text inputs.
  • Custom drawing to the widgets’ underlying canvas, allowing for custom shapes and graphics.

A Very Basic Simulation

To get a better feel for this GUI framework, I decided to implement a primitive simulation to display on my application. The rules are simple: You have people and you have food! A person (red square) has a stomach and energy. The person’s stomach is filled with food, and energy is created from the food in their stomach. Food (green square) is a stationary source for a person to fill their stomach.

People move around in search of stationary food supplies, which expends energy and in turn consumes the food in their stomach. If a person’s energy and stomach level reach 0, they are no longer considered alive (blue square).

Kivy

All right, with the simulation explained, it’s time to dig into Kivy! Kivy gives you so much to play with, yet has a refreshingly minimal barrier to entry. You can add buttons, text boxes, labels, and so much more! The best part is, through the Kv Language, your GUI can function on its own without Python code driving it. There is a helpful rundown on the Kv language on Kivy’s website here.

Kv Design Language

An example of this can be found in my ‘SimulationController’ widget, defined in SimulationGUI.kv below:

<SimulationController>
    BoxLayout:
        width: root.width
        height: root.height
        orientation: "horizontal"
        BoxLayout:
            orientation: "vertical"

            Slider:
                id: grid_height
                min: 25
                max: int(root.height)
                step: 1
                orientation: "horizontal"
            Label:
                text: str(grid_height.value) + " Map Height"
            
            Slider:
                id: grid_width
                min: 25
                max: int(root.width)
                step: 1
                orientation: "horizontal"
            Label:
                text: str(grid_width.value) + " Map Width"

You can see the ‘SimulationController,’ with two nested ‘BoxLayouts.’

<SimulationController>
    BoxLayout:
        width: root.width
        height: root.height
        orientation: "horizontal"
        BoxLayout:
            orientation: "vertical"

On the outermost ‘BoxLayout,’ I define its width and height to fill the space allotted to it by the element into which it is inserted. It took me a while to wrap my head around this as I attempted bigger layouts. Also note the orientation, which describes the direction things will be stacked.

Inside the innermost ‘BoxLayout,’ you can see Slider and Label pairs:

            Slider:
                id: grid_height
                min: 25
                max: int(root.height)
                step: 1
                orientation: "horizontal"
            Label:
                text: str(grid_height.value) + " Map Height"
            
            Slider:
                id: grid_width
                min: 25
                max: int(root.width)
                step: 1
                orientation: "horizontal"
            Label:
                text: str(grid_width.value) + " Map Width"

The Slider and Label are GUI Widgets built into the Kivy library. For the Slider, I set an ID value (basically a name referenced elsewhere in the code), along with a minimum and maximum value. In the Label Widget, you can see the GUI driving itself a bit – the text of the Label is set by the value currently selected by the Slider. This might be a commonplace functionality for a GUI framework that I just haven’t discovered before, but I really like it. Everything looks a lot cleaner when simple functionalities like this are independent from the Python code.

Invoking Python code directly from this language is pretty easy, too.

        Button:
            id: start_but
            on_press: app.run_sim(root.ids.num_people.value, root.ids.num_food.value, root.ids.grid_height.value, root.ids.grid_width.value)
            text: "Start Simulation"

In the above example, you can see a direct reference to ‘app’ and a call to ‘run_sim.’ This is defined on the Python side in my ‘SimulationGUIApp’ object which inherits from the ‘App’ object from the Kivy library.

The reference to ‘run_sim’ uses information from our widget and calls the function with the values collected as parameters. Awesome!

Kivy and Python

For me, it is often difficult to understand the interface between a design language and a program language (I’m looking at you, Javascript and CSS!). Let’s cover some basics.

Make Kv Definitions Python reference-able

First thing: How do you let Kivy know you are using a Kv file? There are a few ways to do this, but the one I chose was to name the .Kv file the same thing as my app’s object name (e.g., Kv filename is ‘SimulationGUI.kv’ and my Python object is ‘SimulationGUIApp’). An important detail here is to note that my Python object name ends with ‘App,’ while the filename doesn’t.

As for the Widgets defined in the .Kv file, all we have to do to use them on the Python side is create an empty class with the same name, as shown below:

.Kv Definition
<SimulationController>
    BoxLayout:
        width: root.width
        height: root.height
        orientation: "horizontal"
Python Definition
class SimulationController(Widget):
    pass

Even though the class is empty, the layout and widgets are defined in the .Kv file and will be shown when the app loads.

Once you do that, you can add Widgets to an App, creating your interface. In the example below, class ‘MainScreen’ inherits from ‘BoxLayout’ and then adds two Widgets to itself. One of the widgets is our familiar ‘SimulationController.’ Next, a ‘MainScreen’ object is instantiated and added to the app class (‘SimulationGUIApp’) in its build method.

class SimulationViewport(Widget):
    pass

class SimulationController(Widget):
    pass
        
class MainScreen(BoxLayout):
    def __init__(self, **kwargs):
        super(MainScreen, self).__init__(**kwargs)
        self.orientation = "vertical"
        self.size_hint = (1.,1.)

        self.sim_view = SimulationViewport()
        self.sim_cont = SimulationController()

        print(self.sim_cont.ids["grid_height"].value)
        
        self.add_widget(self.sim_view)
        self.add_widget(self.sim_cont)

class SimulationGUIApp(App):
    def build(self):
        main_screen = MainScreen()

        self.sim_view = main_screen.sim_view

        return main_screen

I decided to do it this way so I could access the canvas of our ‘SimulationViewport’ for drawing people and food. Also, in the above snippet, I left in a line of code to show how you can reference the values of the Widgets contained in our Kv-defined widgets using the ‘ids’ dictionary.

A Slightly More Advanced Kivy Concept

Until now, we have focused on the basic uses of Kivy. Now, let’s dive into how to draw on Widgets so that we can see our people as they wander around our world looking for food sources.

Assumptions Kill Me

One frustrating mistake I made was when I assumed that my drawing would be relative to the corner of my Widget. In a multi-widget layout, this would mean that the position of each drawing would be local to its own widget. In reality, the drawings are positioned within the global window’s coordinate grid, regardless of widget. With this mistaken assumption, I thought that the Widgets were overlapping one another. I likely developed this intuition from HTML and CSS, where the flow layout can cause elements to exist over one another.

Finally, I realized that drawing occurs from the window’s origin. This is really simple to do, but figuring it out was difficult. Below is the code snippet that saved me, particularly the self.sim_view.pos[0] and [1].

pos_screen_x = x * block_width + self.sim_view.pos[0]
pos_screen_y = y * block_height + self.sim_view.pos[1]

Rectangle(pos=(pos_screen_x, pos_screen_y), size=(block_width, block_height))

A ‘Rectangle’ is drawn at the x, y coordinates of our Person or Food with an offset of ‘self.sim_view.pos[0]’ and ‘[1],’ which gets it to our desired spot. The challenging part, however, is that you can draw outside the widget and it will still be rendered if it is within the window’s region.

So now we are drawing in the right place, but how do we select a color?

It’s like painting

Right above the lines of code where we select the x and y coordinates, there is an if-else block where the color palette is chosen for the upcoming ‘Rectangle’ draw. This color choice selects a value between 0 and 1 for each color (Red/Green/Blue). Here, we check if our object is ‘Food’ or ‘Person.’ Then, we select red or green of varying intensity, based on energy or food levels:

if isinstance(thing, Food):
    color_strength = thing.Amount / thing.MaxAmount
    color_strength = max([color_floor, color_strength])

    Color(0.,color_strength,0.)
elif isinstance(thing, Person):
    color_strength = thing.Energy / thing.EnergyMax
    color_strength = max(color_floor, color_strength)
    if thing.alive:
        Color(color_strength,0.,0.)
    else:
        Color(0.,0.,1.)

Conclusion

This has been a wonderful learning experience for me, and I hope by posting it here it can be useful to you, too! I look forward to possibly building out the simulation and controls with more options and depth.

If you are interested in further reading on my work in the simulation building realm, check out my wandering development post here, where I attempt to build a balanced simulation of a simple model for populations and reproduction.

There is also an excellent website with many models much better than mine on a website called NetLogo. I found this gem while doing research for my simulation (I also bought a book, Complex Adaptive Systems: An Introduction to Computational Models of Social Life, to add to my collection of unread books!).

Thank you so much for sticking with me, I hope the coming week receives you well 🙂

Code

Get the code for my simulation at my GitHub.

Spelling made exciting in a labyrinthine maze!

Last week, my son Isaac was crushed by his 3rd-grade spelling test. For an 8-year-old, it can be difficult to study something that might seem opaque in its usefulness. So, with a bit of Python magic, let us engage the children in solving a spelling maze!

The idea is to accept the input of a word, then generate a maze where the path to success is made easier if you know the next letter of the word.

Just Passing Through (TL;DR)

I figured I would need the ability to 1) generate a maze, 2) find all the junction points of that maze, and 3) draw letters at the junctions which lead the player down the correct route.

I built infrastructure and foundational code to perform functions like determining block positions, drawing the maze, and cleaning up relationships between blocks. Separately, I created a Maze class, which handled the maze-generation algorithm. I treated this Maze class as a parent to the WordMaze class which handles many of the functions related to letter placement. This helped de-clutter the Maze class and extend its functionality.

At the end of my project journey, I developed the code to apply words in the maze. This part became pretty complex, as I needed many parameters to be met so that the letters would group properly and the possible “incorrect” letter junctions along the solution route would be challenging enough. There were bugs to iron out, but eventually the code worked. Even better, soon a kid will be learning his spelling list and having fun at the same time!

Staying for a While (Technical Peeps)

The code for this project can be found at my GitHub.

Infrastructure and Foundational Code

After writing a toy maze generator, it was clear that I would need to build some foundations to help reduce the top-level complexity of the maze generation code. The three main foundational classes were:

  • A Map Class
  • A Draw Class
  • A Path Class

Foundational Code: Map Class

The Map class manages the grid of blocks that represents a player’s position at any point in the maze. It handles:

  • Block relationships, including entry and exit directions
  • Block positions, including x-y coordinate grid
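A minimal sketch of what that bookkeeping might look like (the real Map class in the repository is richer; these fields mirror the ones used by the maze code later in this post):

```python
from enum import Enum

class GridDirection(Enum):
    North = 0
    South = 1
    East = 2
    West = 3

class Block:
    """One cell of the grid, tracking how paths enter and leave it."""
    def __init__(self, x, y):
        self.x, self.y = x, y
        self.entry_direction = None
        self.exit_directions = set()
        self.explored = False

class Map:
    """A grid of Blocks addressed by x-y coordinates."""
    def __init__(self, width, height):
        self.grid_width, self.grid_height = width, height
        self.blocks = [[Block(x, y) for x in range(width)]
                       for y in range(height)]

    def get_block(self, x, y):
        return self.blocks[y][x]
```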

Foundational Code: Draw Class

The draw class (‘Drawable2D’ in the code) is responsible for drawing to a grid of pixels, as defined by the object itself. This is an inheritable class allowing an object to set its pixel width and height and call functions like ‘fill,’ ‘draw_portion,’ or ‘draw_edge.’

The object inheriting from this class must implement a ‘draw’ method instructing how to draw itself.
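The pattern might look something like this sketch (apart from ‘fill’ and ‘draw,’ which the post names, the identifiers here are illustrative):

```python
class Drawable2D:
    """Inheritable pixel-grid drawing surface (simplified sketch)."""
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.pixels = [[0] * width for _ in range(height)]

    def fill(self, value=1):
        # Fill the entire pixel grid with one value
        for row in self.pixels:
            for x in range(self.width):
                row[x] = value

    def draw(self):
        # Subclasses must say how to draw themselves
        raise NotImplementedError

class Dot(Drawable2D):
    def draw(self):
        # Place a single pixel at the center of the grid
        self.pixels[self.height // 2][self.width // 2] = 1
```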

Foundational Code: Path Class

This class is responsible primarily for generating random paths using the ‘step_path’ function. By moving this logic to its own class, we radically reduce the complexity of generating the maze.
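Here is a toy sketch of the ‘step_path’ idea over a plain dictionary grid (the real Path class works on Map Blocks and manages entry/exit directions; this only captures the random wandering and branching):

```python
import random

class Path:
    """Wander to random unexplored neighbors, reporting each visited
    cell as a possible future branch point."""
    def __init__(self, grid, start):
        self.grid = grid              # (x, y) -> explored flag
        self.current = start
        self.complete = False
        grid[start] = True

    def step_path(self):
        x, y = self.current
        neighbors = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        open_cells = [n for n in neighbors
                      if n in self.grid and not self.grid[n]]
        if not open_cells:
            self.complete = True      # dead end: this path is finished
            return []
        self.current = random.choice(open_cells)
        self.grid[self.current] = True
        return [self.current]         # this cell may branch later
```

Driving ‘step_path’ in a loop until ‘complete’ carves one corridor per Path object, which is exactly how the maze generator consumes it.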

Now that we have a high-level view of the classes involved, let’s get into the actual maze generation!

Maze Generation

The code is available at my GitHub, but I will include the chunks of code I am referencing as we go along. For this section we will look at the ‘generate_maze’ function in the Maze class, as shown below:

def generate_maze(self):
    # Setup our start and end blocks
    self.map_start = self.map.get_start_block()
    self.map_end = None

    # We always go from North to South
    self.map_start.entry_direction = GridDirection.North

    # Setup our temporary algorithm variable
    new_start = self.map_start 

    # Start our Algorithm
    while not self.map_end:
        # Generate random paths from our starting block
        self.generate_paths(new_start)
        
        # Get the lowest block and set an exit direction (South)
        y, lowest_block = self.get_lowest_block()
        lowest_block.exit_directions.add(GridDirection.South)

        # If we have hit the bottom of the play area make this our end block
        if y == self.map.grid_height - 1:
            self.map_end = lowest_block
        # If we haven't hit the bottom then make this our next start block for random paths
        else:
            new_start = self.map.get_block_in_direction(lowest_block, GridDirection.South)
            new_start.entry_direction = GridDirection.North

    # Draw the final maze
    self.map.draw()

First, we only make mazes that start at the top and end at the bottom. With this basic tenet, we can say that we aren’t done building the maze until at least one block touches the southern edge of the grid. This is why our main while loop checks for a ‘map_end’ block: we iterate the algorithm’s steps until such a block exists.

So first we grab the map’s start block, set it as our ‘new_start,’ and generate paths from it. Once the paths are generated, we find the lowest explored block on the grid and give it a southern exit. If that block touches the southern edge of the grid, we are done — it becomes our end block! If it doesn’t, we take the block just south of it, set it as ‘new_start,’ and let the algorithm run again.

Generate Paths

The ‘generate_paths’ function (code below) is a randomized depth-first search that spawns new possible path starts as it explores its current path.

def generate_paths(self, start_block: Block):
    # Setup our Algorithm temporary variables
    possible_path_starts = [start_block]

    # While we have possible path starts continue building
    while possible_path_starts:
        # Randomly choose our next path start and remove it from the list
        current_start = random.choice(possible_path_starts)
        possible_path_starts.remove(current_start)

        # If this block is already explored move on
        if current_start.explored:
            continue
        
        # Otherwise create a new path and step it until it is finished
        path = Path(self.map, current_start)
        while not path.complete:
            possible_path_starts.extend(path.step_path())

Below is a GIF of how this algorithm looks when it is running.

This is what the path generation algorithm looks like while it runs.

It will eventually produce a maze like this:

Randomly generated maze paths

All right, so we have a maze generated! Now how do we get our word in there to guide our kid to the spelling solution?

Putting Spelling Words in the Maze

With 20/20 hindsight, I probably should have accounted for exit count during path creation when I was working on the maze generator. I hadn’t considered how to control the number of exits on the solution path until I tried applying a word to it. The problem is that our solution path must have exactly as many junctions as there are letters in our target word, so if the solution path has too many junctions, we must somehow remove the extras.

I had a few ideas on how to do this with simple algorithms that I thought for sure would do exactly what I wanted them to do… None of them worked for various reasons, largely due to bugs in the foundational code.

I created multiple band-aids to mitigate these problems; when the band-aids stacked too high, I would sift the code down to where it belonged. I will warn the reader, though: none of the code I present here is pretty. This was one of those “quickly written/get the thing working!” kinds of projects. Should it be destined for long-term use, I think a rewrite would be warranted and easily done. Anyway, on to the method I chose!

Apply Word

The ‘apply_word’ method orchestrates all the steps in adding letters to our maze and is included in a child class of the Maze class called… WordMaze.

def apply_word(self):
    exit_count = len(self.word)
    solution_junctions = self.get_solution_path_junctions()
    
    if len(solution_junctions) < exit_count - 1:
        raise IndexError("Not enough exit points to write word!")
    
    # Randomly select the junctions to be used for our word path
    selected_junctions = self.select_solution_path_junctions(solution_junctions)

    # Close off and unexplore all other junctions
    self.close_all_solution_path_junctions(solution_junctions, selected_junctions)

    # Reopen paths for unexplored areas on valid routes
    self.fill_out_unexplored_areas()
    
    # Apply letters to junctions
    self.apply_letters_to_junctions()

Put simply, after we have generated a maze using our parent class, we find all junctions in the map and select ones that are located along the solution path. From this smaller set, we randomly choose the junctions we want to use for the letters of our word (this shortens the list further).

With our solution letter junctions selected, we need to get rid of all other junctions along the solution path. So, we close off their entry and “un-explore” them, meaning we reset them to the original state they held before we applied any path algorithm to them.

At this point we have a pretty boring maze, and it is likely easy to tell which path is the right one, because we “un-explored” everything except the junctions chosen for our letters. To increase complexity, we re-open the dead ends of our selected paths, exploring from them until the dead space is accessible again from the selected junctions.

If this is still confusing, stick with me, we have some visual elements below that will hopefully fill any gaps of understanding. First, let’s break down the important methods included in the code block above that follow the steps I just described.

Get Solution Path Junctions

This function gets all of the junctions in the map and keeps the ones that exist on the solution path.
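Conceptually, that filter might look like this sketch (a hypothetical helper, not the repository code):

```python
def get_solution_path_junctions(all_junctions, solution_blocks):
    # A junction counts only if its block lies on the entrance-to-exit path
    on_path = set(solution_blocks)
    return [j for j in all_junctions if j in on_path]
```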

Select Solution Path Junctions

Of the set of junctions that exist on the solution path, select only as many as there are letters in our target word. Do this selection at random.
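Sketching that selection (a hypothetical helper; the real method works on junction objects rather than a plain list):

```python
import random

def select_solution_path_junctions(junctions, word):
    # Pick exactly one junction per letter, chosen at random but kept
    # in path order so the word still reads start-to-finish.
    indices = random.sample(range(len(junctions)), len(word))
    return [junctions[i] for i in sorted(indices)]
```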

Close All Solution Path Junctions

This closes all of our non-selected solution path junctions and marks the blocks cut off from the solution path as unexplored.

Fill Out Unexplored Areas

Now that we have a bunch of unexplored territory, let’s re-explore it from the dead ends of the selected paths. This should make the puzzle a bit more…puzzling!

Apply Letters to Junctions

Finally, let’s get some letters onto the junction exits. This method first iterates all junctions, giving each a random letter. Then it iterates all the solution path blocks in order, applying the correct letter of the word to each. To make things a bit trickier, the exits orthogonal to the solution path receive the letter that comes after the correct letter.
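The core of that lettering scheme could be sketched like this (hypothetical helper; the real method also places the word’s next letter on exits orthogonal to the solution path, which this simplification only notes in passing):

```python
import random
import string

def assign_letters(junction_ids, solution_ids, word):
    """Every junction starts with a random decoy letter, then the
    solution junctions are overwritten, in order, with the letters
    of the word."""
    letters = {j: random.choice(string.ascii_lowercase) for j in junction_ids}
    for i, junction in enumerate(solution_ids):
        letters[junction] = word[i]
    return letters
```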

Below is a GIF showing the re-exploration of unexplored map area:

Re-explore after culling unused solution path junctions

With re-exploration completed, we can perform letter application. Below is the result of running the following command:

python3 ./main.py --word Discover
A maze with "discover" as the solution spelling word
Can you solve for “discover”? Start at the entrance and find out!

Conclusion

Spelling is difficult for many youngsters my son’s age. I hope that doing puzzles will push it just far enough over the edge of engagement to keep him moving forward. I want to thank you for sticking with me through this post. I hope the ideas and implementation I shared were thought-provoking. If you have any thoughts inspired by this post, please feel free to leave a comment below.

Also, if there is interest, I can push a web version of this application so that you can generate spelling mazes from a webpage instead of having to run it on your local machine.

Thanks for reading!

Spelling Maze Code

The code for this project can be found at my GitHub.

My Related Work

If you find algorithms interesting and would like to read about another project of mine, check out the algorithm problem from my PaperBoy VR Game.