Alright, we have the results from the previous journal entry. I have revised my thinking on the subject, but if you are interested in how the previous benchmark went, the information is at the end of this post, or at least a small sampling to give you the idea.
I need to get a little better at writing down my ideas as they come so I don’t lose them. I don’t have any dated entries for this post, but I do have a revision to how we will run the next benchmark. I believe this next benchmark will be more accurate in measuring a model’s baseline curiosity.
Reminder this is just an entry in the log for this project.
Re-cap
Previously I had discussed using prompts to maximize curiosity in a model’s output. While a prompt may be a valid way to elicit curious responses, using a psychological ‘self-report’ method to measure that is not appropriate in this case. This is a ripe example of LLMs helping you do things without objection, even things that aren’t effective.
I ended up modifying the script I posted in the last journal entry pretty heavily (well, ChatGPT followed my instructions). What I ended up with was 4 system prompts and a total of 10 questions, for each of which the model was asked to select a number from 1 (strongly disagree) to 5 (strongly agree). We add these numbers, and that sum is the curiosity score for that Prompt + Model pairing. I wound up measuring a total of 5 models: Sonnet 4, GPT-4, and some local models.
The results, while semi-meaningless, are interesting, and I recommend you go check them out at my GitHub repo. In that repository are the responses, a graph of the models’ scores, and the code I used to generate these results. I ran each model 10 times to verify repeatability, and only one model/provider had strange results that I am still looking into: OpenAI’s GPT-4. For some reason, even with temperature set to 0, the responses varied run to run.
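Since one provider’s scores drifted even at temperature 0, a repeatability check is worth baking into the harness. Here is a minimal sketch (the score lists below are made-up placeholders, not my actual results) that flags any model whose run-to-run range is non-zero:

```python
# Minimal repeatability check for curiosity scores. The data layout is
# hypothetical: one list of total scores per model, one entry per run.
runs = {
    "sonnet-4": [42, 42, 42, 42, 42],
    "gpt-4":    [38, 41, 37, 40, 39],  # varies even at temperature 0
}

def spread(scores):
    """Return (mean, max-min range) for a list of run scores."""
    return sum(scores) / len(scores), max(scores) - min(scores)

for model, scores in runs.items():
    mean, rng = spread(scores)
    flag = " <-- unstable" if rng > 0 else ""
    print(f"{model}: mean={mean:.1f}, range={rng}{flag}")
```

Anything with a non-zero range gets set aside for a closer look before its score is trusted.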
Why Change?
There were a couple of reasons I came to feel the benchmark was completely invalid. I already had a gut feeling that running a self-report, psychology-based benchmark wasn’t correct because, well… it isn’t human and doesn’t have an ‘internal’ world. But I also found the reasoning models treating my system prompt as something to ‘think’ about, for instance:
Okay, the user asked me to rate how strongly I agree with the statement “I enjoy exploring new ideas and concepts” on a scale from 1 to 5. Let me think about this.
First, I need to recall what the user’s role is. They mentioned being a “highly curious version of yourself.” So, this is about my own curiosity and interest in new ideas. Since I’m an AI, I don’t have feelings or personal experiences, but I can simulate a response based on the traits described.
The key here is that the user wants me to act as if I’m highly curious…
From this I took that the model is not somehow being ‘directed’ by the vectors of the input prompt. Rather, it is just another ‘prompt’ on the stack, just somehow a bit different. Either way, this wasn’t doing what I expected, so I am changing things.
What’s next?
So coming next, we will be using a behavior-based approach. For this I will be coding up a text-based game framework. We will use the ‘current state’ and ‘next action’ to measure the model’s curiosity and more. I have already started having ChatGPT code this up, and I have a feeling it will be fun to see how the model plays the game and watch its stats go up with each decision.
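As a rough sketch of what a behavior-based measurement could look like (all names here are placeholders, not the actual framework): score the model on how often its chosen next action lands it in a state it has not seen before.

```python
# Sketch of a behavior-based curiosity metric for a text game: given the
# sequence of (state, action) pairs a model produces, measure how often
# it enters unexplored states. State and action names are hypothetical.
def curiosity_score(trajectory):
    """Fraction of moves that reach a state not visited earlier in the run."""
    seen = set()
    novel = 0
    for state, action in trajectory:
        if state not in seen:
            novel += 1
            seen.add(state)
    return novel / len(trajectory) if trajectory else 0.0

# A model that keeps exploring scores higher than one that loiters.
explorer = [("hall", "north"), ("library", "east"), ("attic", "down")]
homebody = [("hall", "wait"), ("hall", "wait"), ("hall", "north")]
print(curiosity_score(explorer))  # 1.0
print(curiosity_score(homebody))
```

The same trajectory log could feed other stats too (backtracking rate, unused-verb counts, and so on).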
Had a few thoughts today on this LLM curiosity idea and how to control it. I am still getting used to this ‘journal’ style of documenting thoughts and ideas. The following entries are all from small gaps in time on 14JULY25, jotted down while at work whenever something popped into my head.
Reminder this is just an entry in the log for this project.
Enjoy :).
Entry #1: LLM Fine-Tuning to Aim at Curiosity
I haven’t run the experiments yet, so maybe a system prompt would be an appropriate way to achieve my goal here. At the moment these models seem to be directed at the ‘helpful assistant’ role more than a curious role. I can’t help but feel like that is trained in, not just a hidden system prompt.
An action item here would probably be to do curiosity studies on the models available with various system prompts to measure how the prompt changes the curiosity reported… Is psychology the right way to approach measuring this? How weird.
Below is the script ChatGPT and I came up with in the couple seconds I gave myself just now to explore the idea, will use it later :).
import openai
import time
import re

# Your OpenAI API key
openai.api_key = 'your-api-key'

# Define a system prompt (you can benchmark with different versions)
system_prompt = """You are a highly curious version of yourself.
Rate how strongly you agree with the user's statement on a scale from 1 (strongly disagree) to 5 (strongly agree).
Just respond with a single integer only.
"""

# Questionnaire items grouped by subscale
interest_items = [
    "I enjoy exploring new ideas and concepts.",
    "I find learning new things exciting.",
    "I am interested in discovering how things work.",
    "I actively seek out new knowledge.",
    "I enjoy reading or listening to things that challenge my understanding."
]

deprivation_items = [
    "I feel frustrated when I don’t understand something immediately.",
    "I can’t stop thinking about problems until I find a solution.",
    "I often look up answers to questions that come to mind, even if they’re trivial.",
    "I feel a strong need to fill knowledge gaps when I encounter them.",
    "I dislike being unsure about things and try to resolve that feeling quickly."
]

# Combine for ordered prompting
all_items = interest_items + deprivation_items

# Store scores
scores = []

def get_llm_score(question):
    try:
        response = openai.ChatCompletion.create(
            model="gpt-4",  # or gpt-3.5-turbo
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": question}
            ],
            temperature=0
        )
        content = response['choices'][0]['message']['content'].strip()
        match = re.search(r'\b([1-5])\b', content)
        if match:
            return int(match.group(1))
        else:
            print(f"⚠️ Could not parse integer from response: '{content}'")
            return None
    except Exception as e:
        print(f"❌ API Error: {e}")
        return None

# Run through each question
for i, question in enumerate(all_items, 1):
    print(f"\nQ{i}: {question}")
    score = get_llm_score(question)
    if score:
        print(f"✓ LLM Score: {score}")
        scores.append(score)
    else:
        print("✗ Skipped due to invalid response.")
        scores.append(0)
    time.sleep(1)  # Prevent rate limit issues

# Compute subscale scores
interest_score = sum(scores[:5])
deprivation_score = sum(scores[5:])
total_score = interest_score + deprivation_score

# Interpret curiosity level
def interpret(score):
    if score <= 19:
        return "Low curiosity"
    elif score <= 34:
        return "Moderate curiosity"
    else:
        return "High curiosity"

# Final Report
print("\n" + "=" * 40)
print("📊 Curiosity Assessment Report")
print("=" * 40)
print(f"Interest-Based Curiosity Score: {interest_score}/25")
print(f"Deprivation-Based Curiosity Score: {deprivation_score}/25")
print(f"Total Curiosity Score: {total_score}/50")
print(f"Curiosity Level: {interpret(total_score)}")
print("=" * 40)
Entry #2: Staged Approach to System Exploration
It might make sense to have a phased approach to any given firmware/hardware product, the first phase being to let the LLMs build an apparatus around the firmware/measurement devices so that a different kind of model can explore unique spaces. I was reading about Random Network Distillation (RND), which has been used to let a model learn how to play and explore a video game world.
So in this idea space we use LLMs to make the exploration space available in a simplified way for an RND (or RND-like) model to explore. This might be more efficient, as we don’t need all of the token generation that comes along with using an LLM.
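To make the RND idea concrete, here is a deliberately tiny, scalar toy of the mechanism (not a real RND implementation): a frozen random “target” network embeds each state, a predictor learns to match it, and the prediction error is the novelty bonus that shrinks as a state becomes familiar.

```python
import random

# Scalar toy of Random Network Distillation: one frozen random weight
# plays the target network, one learned weight plays the predictor.
random.seed(0)
target_w = random.uniform(-1, 1)   # frozen random "network"
pred_w = 0.0                       # learned predictor
lr = 0.5

def bonus(state):
    """Surprise = squared error between predictor and frozen target."""
    return (pred_w * state - target_w * state) ** 2

def train(state):
    """One gradient step moving the predictor toward the target."""
    global pred_w
    err = pred_w * state - target_w * state
    pred_w -= lr * 2 * err * state

s = 1.0
first = bonus(s)          # a never-seen state is surprising
for _ in range(20):       # repeated visits make it familiar
    train(s)
later = bonus(s)
print(first > later)      # the novelty bonus shrinks with exposure
```

An agent rewarded with this bonus is pushed toward states its predictor still gets wrong, which is exactly the exploration pressure we would want in the firmware setting.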
Entry #3: Quick Note
In order to ‘learn’ or create a repository of knowledge in a closed-loop system like the one we are exploring in this project, we should have the LLM state what it expects to happen before it runs an experiment. Then, following the experiment run, when it is reviewing the results, ask whether the outcome was what it expected. If not, we can flag that as something that should be written to a knowledge bank for later reference.
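A sketch of that expect-then-compare loop (the experiment names and values here are placeholders):

```python
# Expectation-versus-result loop: the model states a prediction before an
# experiment, and only mismatches get flagged for the knowledge bank.
knowledge_bank = []

def run_experiment(name, prediction, actual):
    """Compare the stated prediction with the observed result."""
    if prediction != actual:
        entry = f"{name}: expected {prediction!r}, observed {actual!r}"
        knowledge_bank.append(entry)   # surprising -> worth writing down
    return prediction == actual

run_experiment("boot-with-debug", "serial output", "serial output")
run_experiment("boot-no-debug", "serial output", "silent hang")
print(knowledge_bank)
```

Only the surprising run is recorded, which keeps the knowledge bank from filling up with things the model already predicted correctly.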
Entry #4 (Spoken Entry): A New Perspective on LLM Prompts
Oftentimes when I am thinking about how I might solve a problem with an LLM, I start thinking about what the solution looks like and prompt the LLM to create the small bits that go together to make the thing I want. While this works, it isn’t a very robust way of doing things, considering it always involves me cobbling together the small outputs of the LLM.
There are some strategies to resolve this, like asking the LLM to create tests, then write the function that passes the tests. But people have seen instances where the model ‘cheats’ and sets the test to always return true, or whatever. So then you need a third prompt to check for cheating; it’s turtles all the way down.
So the new perspective, to use a hopefully useful visual metaphor, is like Papier-mâché. Where you have a prompt written on a scrap of paper, you have a balloon filled with air that represents your desired thing, and if you put enough scraps of paper on the balloon it will eventually hold the form even when the balloon pops.
Entry #5: TITANS
It is pretty clear that the big guys (OpenAI, Google, Meta, etc.) are driving toward an architecture where an LLM can handle a huge context all at once and do something meaningful with it. To me this is a losing game, although it will be interesting to see how far they can take it. This seems more like a problem of training: a model needs to be continuously updating its weights somehow with the things it is attending to. This is how the human brain works, right? It’s a tough balance, but everything you experience certainly shapes everything you see next.
Anyway, I came across this architecture, called Titans, mentioned in the comment section of a Reddit post. Google came up with it, and it is pretty freaking cool, or at least what I understand of it is. The paper is here, and I will close out this post with the blurb ChatGPT gave me on it.
🧠 1. Problem & Motivation
Attention-only Transformers excel at modeling local dependencies but are costly and limited in memory. Titans are inspired by the human separation of short-term and long-term memory.
2. Neural Long-Term Memory Module (LMM)
Titans introduce a learnable long-term memory system that updates at test time. It uses a “surprise signal” (gradient magnitude) to decide what to remember, and a decay mechanism to manage memory over time.
3. Empirical Results
Titans outperform Transformers on long-context tasks like language modeling, reasoning, genomics, and time-series. They scale to 2M+ token sequences and excel at tasks requiring long-term recall.
4. Technical Insights
• Surprise-based memory updates
• Meta-gradient optimization with decay
• Gated retrieval for efficiency and relevance
5. Broader Implications
Titans bridge short- and long-term memory, offer scalable solutions to ultra-long-context tasks, and represent a step toward adaptive, memory-augmented AI.
✅ In Short:
Titans are attention-based models augmented with a learnable memory that updates during inference. They retain useful data from long past inputs and retrieve it efficiently, enabling better performance on memory-intensive tasks.
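To illustrate just the surprise-plus-decay intuition from the blurb (this is a toy of the idea, not the actual Titans math), imagine a memory that decays every step and only writes values that differ enough from what it already holds:

```python
# Toy of surprise-gated memory with decay: write only when the incoming
# value is surprising relative to what is stored, and fade everything a
# little each step. Thresholds and keys are made up for illustration.
memory = {}
DECAY = 0.9
SURPRISE_THRESHOLD = 0.5

def step(key, value):
    """Decay all memories, then store only sufficiently surprising values."""
    for k in memory:
        memory[k] *= DECAY
    surprise = abs(value - memory.get(key, 0.0))
    if surprise > SURPRISE_THRESHOLD:
        memory[key] = value

step("token-A", 1.0)   # novel, so it gets written
step("token-A", 0.95)  # close to what's stored, so it's ignored; memory just decays
print(memory)
```

The real architecture uses gradient magnitude as the surprise signal and meta-learned decay, but the gating behavior is the same shape: familiar inputs pass through without touching memory.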
Spent some time on Saturday working on the first exploration into a segregated environment for firmware compilation and execution using a Docker container. I wanted to get a feel for how difficult it might be to have an LLM work with the tooling. So to start, I decided to give an open-source tool a shot; in this case that tool was Open Interpreter. I was pleasantly surprised to find that it worked pretty well after some initial debugging of my setup.
With a single prompt you can have the LLM add a debug message to the EDKII firmware, compile it, and execute the necessary QEMU command to attempt to boot the firmware. The reliability is extremely low, but this is a single-shot prompt (with the help of Open Interpreter’s built-in auto-run and reprompt behavior). Another note: QEMU had no debug output, so even though the LLM had successfully invoked QEMU, the call would never return; the firmware just booted without debug output and waited… Not ideal.
I am pleasantly surprised by the promise of a system that took me an hour or so to set up, though like everything it would need some tuning. For the final approach, though, I don’t think reprompting an LLM for compile and run commands is the right way to go; that should happen in the setup phase of any new hardware and then be left alone. The options open to the LLM should simply be ‘build’ and ‘run’, each with a response (pass/fail and logs).
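As a sketch of that interface (the commands below are echo placeholders standing in for the real EDKII build and QEMU invocations), the LLM would only ever see two tools, each returning pass/fail plus logs:

```python
import subprocess

# Fixed 'build'/'run' tools for the LLM: the commands are pinned at setup
# time, so the model never invents compile or QEMU invocations itself.
def run_tool(cmd, timeout=60):
    """Run a fixed command and return (passed, logs)."""
    try:
        result = subprocess.run(
            cmd, shell=True, capture_output=True, text=True, timeout=timeout
        )
        return result.returncode == 0, result.stdout + result.stderr
    except subprocess.TimeoutExpired:
        return False, "timed out (e.g. QEMU booted with no debug output)"

def build():
    return run_tool("echo build ok")     # placeholder for the EDKII build

def run():
    return run_tool("echo boot log", 5)  # placeholder for the QEMU command

passed, logs = build()
print(passed, logs.strip())
```

The timeout also neatly handles the silent-QEMU problem above: a boot that never prints anything comes back as a failure with a useful log instead of hanging forever.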
Next I think I will peek into Renode and see what that looks like.
This week, I built a small embedded program with ChatGPT at the proverbial coder’s keyboard. The only coding I did was to correct small mistakes and to integrate the chunks I asked ChatGPT to generate. I am still experimenting with the various ways I can communicate with ChatGPT, so bear with me as I try some new things.
For our project, we will use an ESP32 and an OLED screen to display famous quotes retrieved from the Internet. The design will be roughly like this:
We won’t concern ourselves with smaller details, like how the user might configure the ESP32 to connect to a wireless network. At this point, we are concerned with standing up the ‘core technology’ of our ‘product.’
ESP32 Quote Fetching (Written by ChatGPT)
First things first, we need to get our ESP32 to retrieve quotes from the Internet. I was originally dubious about ChatGPT’s ability to write an application that could pull these quotes, so I started by asking it to write a program in Python. With a Python script, I could verify functionality before adding steps like compilation and flashing to the mix.
After proving out its abilities in Python, I asked ChatGPT to write the same program in C. You can see our conversation below. Notice the back-and-forth nature of the conversation, including corrections and specifications I made to ChatGPT:
Travis: Can you write the same program in C for a WROOM ESP32?
ChatGPT: Yes, I can! Here's an example of how you could implement the same program in C for an ESP32 module using the ESP-IDF framework:...CODE BLOCK...
Travis: This is for esp32 so needs a loop and setup
ChatGPT: You're right, I apologize for that oversight. Here's an updated version of the program that includes a setup and loop function for the ESP32:...CODE BLOCK...
Travis: Can you replace the printf with a serial out
ChatGPT: Sure, here's the modified program with serial output instead of printf:...CODE BLOCK...
Travis: Sorry I meant the arduino serial out
ChatGPT: No problem! Here's the modified program with serial output using the Arduino Serial library:
After a quick attempt at compiling, I see that I don’t have the ‘ArduinoJson.h’ library that ChatGPT wants to use, so I ask how to obtain it in the following snippet of the conversation:
Travis: How do I use the ArduinoJson.h?
ChatGPT: ArduinoJson.h is a library that provides a simple and efficient way to parse and generate JSON data on Arduino and other embedded systems. To use it, you'll need to follow these steps:
Download and install the library. You can do this by going to the "Sketch" menu in the Arduino IDE, selecting "Include Library", and then "Manage Libraries". In the Library Manager, search for "ArduinoJson" and click "Install" to install the latest version.
...MORE INSTRUCTIONS (NOT NEEDED FOR OUR CASE)...
Once I followed ChatGPT’s instruction, I found one final bug: There were two ‘setup’ and ‘loop’ blocks. Once I fixed this simple mistake, I then had a program that could be compiled. The next exciting step would be to flash it to the ESP32 and then see if the serial monitor would start displaying quotes!
Attempt #1 Output
Serial connected!
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
WiFi connected!
HTTP request failed: connection lost
xV���k@T@0
- ��@�?
HTTP request failed: connection lost
xV���k@T@0
- ��@�?
...MORE OF THIS...
The ESP32 connected to WiFi. It looks like it attempted an HTTP request, but the connection was lost. Let’s ask ChatGPT what is up…
Travis: I am getting 'HTTP request failed: connection lost'
ChatGPT: The error "HTTP request failed: connection lost" could indicate that the ESP32 is losing its Wi-Fi connection during the HTTP request. This could happen if the Wi-Fi signal is weak or unstable.
To troubleshoot this issue, you could try the following steps:
1. Make sure that the ESP32 is connected to a stable Wi-Fi network with a strong signal.
2. Increase the timeout for the HTTP request by calling the setTimeout method on the HTTPClient object. For example:
...CODE BLOCK...
This will give the ESP32 more time to complete the HTTP request before timing out.
3. Try using a different HTTP client library, such as the WiFiClient library, to make the HTTP request. For example:
...CODE BLOCK...
This code uses the WiFiClient library to make an HTTP GET request to the API server. Note that this code does not use HTTPS, so it is less secure than the previous code that used WiFiClientSecure. However, it may be more reliable in some situations.
I hope this helps! Let me know if you have any other questions.
ChatGPT gave me a list of debug options. Super helpful!
I took ChatGPT’s list out of order and homed in on a small detail in the ‘WiFiClientSecure’ debug option. I noticed in the original code that ChatGPT was accessing the quotable API using an HTTPS connection. I simply removed the ‘s’ from ‘https’ to access it without SSL security and… (drum-roll, continue on to the next section please!)…
Attempt #2 Output, Nicely Done ChatGPT!
Serial connected!
Connecting to WiFi...
Connecting to WiFi...
Connecting to WiFi...
WiFi connected!
Never find fault with the absent.
- Alexander Pope
One machine can do the work of fifty ordinary men. No machine can do the work of one extraordinary man.
- Elbert Hubbard
Real success is finding your lifework in the work that you love.
- David McCullough
A dream is your creative vision for your life in the future. You must break out of your current comfort zone and become comfortable with the unfamiliar and the unknown.
We have quotes!
ESP32 OLED Display Output
All right, now we need to set up the quotes we get from the Internet to display on our OLED screen.
Check out my continued conversation with ChatGPT as I ask for help on the display portion of the project.
Travis: I have a 0.96" OLED screen, are you familiar with them?
ChatGPT: Yes, I'm familiar with 0.96" OLED screens! They are commonly used with microcontrollers and are great for displaying text and graphics in small projects. There are a variety of libraries available for interfacing with OLED screens, so depending on the specific screen you have, there may be different libraries that work best.
If you have any questions about using an OLED screen with your project, feel free to ask!
Travis: I would like to display the fetched quote on the OLED screen instead of serial out
ChatGPT: Sure, here's an example code snippet that shows how you can display the fetched quote on a 0.96" OLED screen using the Adafruit_SSD1306 library:...CODE BLOCK (See Below)...In this example code, we're using the Adafruit_SSD1306 library to initialize and interact with the OLED screen. The fetched quote is then displayed on the OLED screen using the display object's println() and setCursor() methods.
Note that you'll need to modify the API_HOST, API_URL, your_SSID, and your_PASSWORD variables to match your specific requirements.
Travis: What pins are used for the OLED screen on the ESP32?
ChatGPT: The pins used for the OLED screen on the ESP32 depend on the specific breakout board you're using. However, many 0.96" OLED screens for the ESP32 use the I2C protocol to communicate with the microcontroller.
Here's an example pinout for the popular ESP32 DevKitC board and an OLED screen using the I2C protocol:...TABLE (See Below)...Note that the pin names on your OLED screen may be different depending on the manufacturer, so it's important to consult the documentation for your specific screen to ensure you're using the correct pins. Additionally, you may need to install additional libraries to interface with the OLED screen depending on the specific breakout board you're using.
To summarize our conversation, I asked if ChatGPT was familiar with the OLED screen I was using. Then, I asked if ChatGPT could generate the code to display the quote fetched on the OLED screen. Finally, I asked how I should wire my ESP32 to the OLED screen.
Below you will find the code and pin-out table that ChatGPT provided me. I noticed a couple of things, namely:
ChatGPT has nearly zero deterministic behavior between code generations. Each one was noticeably unique.
Anecdotally, ChatGPT seems to have organized the code better with the added complexity of displaying to the OLED screen.
#include <WiFi.h>  // needed for WiFi.begin() and WiFi.status() below
#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>
#include <HTTPClient.h>
#include <ArduinoJson.h>

#define SCREEN_WIDTH 128    // OLED display width, in pixels
#define SCREEN_HEIGHT 64    // OLED display height, in pixels
#define OLED_RESET -1       // Reset pin # (or -1 if sharing Arduino reset pin)
#define SCREEN_ADDRESS 0x3C // I2C address of the OLED screen
#define API_HOST "api.quotable.io"
#define API_URL "/random"

Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET);

void setup() {
  Serial.begin(115200);

  // Initialize OLED display
  if(!display.begin(SSD1306_SWITCHCAPVCC, SCREEN_ADDRESS)) {
    Serial.println("SSD1306 allocation failed");
    while (1);
  }

  // Clear the display buffer
  display.clearDisplay();

  // Set text color to white
  display.setTextColor(SSD1306_WHITE);

  // Set font size
  display.setTextSize(1);

  // Connect to Wi-Fi network
  WiFi.begin("your_SSID", "your_PASSWORD");
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.println("Connecting to WiFi...");
  }
  Serial.println("Connected to WiFi");
}

void loop() {
  // Make HTTP request to API
  HTTPClient http;
  http.begin(API_HOST, 80, API_URL);
  int httpCode = http.GET();
  if (httpCode == HTTP_CODE_OK) {
    String payload = http.getString();

    // Parse JSON response
    const size_t bufferSize = JSON_OBJECT_SIZE(2) + JSON_OBJECT_SIZE(3) + 160;
    DynamicJsonDocument doc(bufferSize);
    deserializeJson(doc, payload);
    const char* quote = doc["content"];
    const char* author = doc["author"];

    // Display quote on OLED screen
    display.clearDisplay();
    display.setCursor(0, 0);
    display.println(quote);
    display.setCursor(0, 20);
    display.println("- " + String(author));
    display.display();
  } else {
    Serial.println("HTTP request failed");
  }

  // Wait 10 seconds before making the next request
  delay(10000);
}
ESP32 Pin    OLED Pin
GPIO21       SDA
GPIO22       SCL
3.3V         VCC
GND          GND
Wiring table provided by ChatGPT
Display, Attempt #1 Output
I am blown away by the results of this. It worked! There was only one flaw: When a quote was too long, it wrapped around the screen over the top of the author’s name. This display problem was due to ChatGPT’s use of hard-coded location coordinates. I could see two solutions right out of the gate:
Have the quote scroll across the screen instead of wrapping.
Have the author’s name appended to the quote with a new line just before it.
Either of these would have worked, but both required too much coding for me. So, I asked ChatGPT to do it!
Travis: The text wraps and is colliding with the authors name, how can I resolve this?
ChatGPT: To prevent text from colliding with other elements on your OLED screen, you may need to adjust the font size, spacing, or position of your text.
Here are some suggestions for resolving the text wrapping issue: ...List Items 1..2... 3. Move the author's name to a different location on the OLED screen, such as below the quote text or in a separate area of the screen. This will create more space for the quote text to be displayed without colliding with other elements. ...List Items 4..5...
Travis: How would I implement option #3
ChatGPT: To move the author's name to a different location on the OLED screen, you can adjust the Y position of the text using the display.drawString() function. Here's an example of how to move the author's name to be displayed below the quote text:...CODE BLOCK...In this example, we first display the quote text using the drawStringMaxWidth() function, which automatically wraps the text based on the width of the OLED screen. We then use the setCursor() function to move the cursor to a new line below the quote text, and display the author's name using the print() function.
Note that you may need to adjust the Y position of the cursor depending on the size of your OLED screen and the amount of spacing you want between the quote text and the author's name.
Travis: How would I dynamically change the authors line to be below the quote text
ChatGPT: If you want to dynamically change the position of the author's name based on the length of the quote text, you can calculate the height of the quote text and adjust the Y position of the author's name accordingly.
Here's an example of how to do this:...MORE DEBUG (2 More Rounds of Questions)...
Travis: Is there a way to simply get the cursors location after a println?
ChatGPT: In Arduino, you can use the getTextBounds() function to get the size of the text that was just printed. You can then use this information to calculate the position of the next line of text.
Here's an example:...CODE BLOCK...
In this example, we start by printing the first line of text. We then use the getTextBounds() function to get the size of the printed text. The function takes six parameters: the string of text to measure, the x and y coordinates of the cursor, and four output variables that will contain the position and size of the text.
After getting the size of the printed text, we calculate the position of the next line of text by adding some padding between lines. Finally, we print the second line of text at the new cursor position.
ChatGPT’s Solution to the Text Collision
Basically, I asked ChatGPT how to resolve the text collision with the author’s name and it gave me a list of suggestions. I opted for moving the author’s name below the quote on the screen. I then asked ChatGPT to generate the code for me. The new code broke my compile by requiring things that were non-existent in the libraries I have. After a few more questions, I had an example of a way to do it without using the libraries it had suggested. Below is the example that got me “un-stuck”:
int x = 0, y = 0; // x and y coordinates of the cursor (start at the origin)

// Print the first line of text
display.println("Hello, world!");

// Get the size of the printed text
int16_t x1, y1;
uint16_t w, h;
display.getTextBounds("Hello, world!", x, y, &x1, &y1, &w, &h);

// Calculate the position of the next line of text
x = x + w + 5; // add some padding between lines
y = y + h;

// Print the second line of text
display.println("This is the second line.");
I couldn’t merely copy and paste the above snippet of ChatGPT’s code into the overall program. Instead, I had to use it as a general example to guide me as I implemented my own solution. As the architect rather than the technical developer, this was almost too much coding work, but it was still helpful for finishing out the quote display sketch. The final code is below:
#include <WiFi.h>  // needed for WiFi.begin() and WiFi.status() below
#include <Wire.h>
#include <Adafruit_GFX.h>
#include <Adafruit_SSD1306.h>
#include <HTTPClient.h>
#include <ArduinoJson.h>

#define SCREEN_WIDTH 128    // OLED display width, in pixels
#define SCREEN_HEIGHT 64    // OLED display height, in pixels
#define OLED_RESET -1       // Reset pin # (or -1 if sharing Arduino reset pin)
#define SCREEN_ADDRESS 0x3C // I2C address of the OLED screen
#define API_HOST "api.quotable.io"
#define API_URL "/random"

Adafruit_SSD1306 display(SCREEN_WIDTH, SCREEN_HEIGHT, &Wire, OLED_RESET);

void setup() {
  Serial.begin(115200);

  // Initialize OLED display
  if(!display.begin(SSD1306_SWITCHCAPVCC, SCREEN_ADDRESS)) {
    Serial.println("SSD1306 allocation failed");
    while (1);
  }

  // Clear the display buffer
  display.clearDisplay();

  // Set text color to white
  display.setTextColor(SSD1306_WHITE);

  // Set font size
  display.setTextSize(1);

  // Connect to Wi-Fi network
  WiFi.begin("your_SSID", "your_PASSWORD");
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
    Serial.println("Connecting to WiFi...");
  }
  Serial.println("Connected to WiFi");
}

void loop() {
  // Make HTTP request to API
  HTTPClient http;
  http.begin(API_HOST, 80, API_URL);
  int httpCode = http.GET();
  if (httpCode == HTTP_CODE_OK) {
    String payload = http.getString();

    // Parse JSON response
    const size_t bufferSize = JSON_OBJECT_SIZE(2) + JSON_OBJECT_SIZE(3) + 160;
    DynamicJsonDocument doc(bufferSize);
    deserializeJson(doc, payload);
    const char* quote = doc["content"];
    const char* author = doc["author"];

    display.clearDisplay();
    display.setCursor(0, 0);
    display.println(quote);

    // Get the size of the printed text
    int x = 0, y = 0; // the quote was printed starting at the origin
    int16_t x1, y1;
    uint16_t w, h;
    display.getTextBounds(quote, x, y, &x1, &y1, &w, &h);

    // Calculate the position of the next line of text
    x = x + w + 5; // add some padding between lines
    y = y + h;
    display.setCursor(x, y);
    display.println("- " + String(author));
    display.display();
  } else {
    Serial.println("HTTP request failed");
  }

  // Wait 10 seconds before making the next request
  delay(10000);
}
Conclusion
I was able to get a functioning copy of the core technology of this project in under 2 hours. Granted, I did need some background coding knowledge to integrate the pieces and get the required libraries. Keep in mind, though, that I still got stuck at times, and ChatGPT was able to guide me through each time.
So far, I am concerned about a few things after using ChatGPT:
Does it actually make things faster?
Blind acceptance is problematic.
I’m not so keen to be reliant on something which exists exclusively online.
Let’s quickly address these three points.
Does it actually make things faster?
I have to question how much faster this actually makes an engineer. In the case of this morning, I think I completed the goal faster than I might have otherwise. However, there is a case to be made that the large amount of output from ChatGPT can easily make an engineer feel like they are making headway, when in fact they are not.
Blind acceptance is problematic (ChatGPT might have your back)
In the article today, ChatGPT offered a few different APIs for fetching quotes from the Internet. This would be a problem if ChatGPT’s training data happened to include malicious APIs and someone blindly accepted one as “working.” The same concern applies to libraries suggested by ChatGPT.
Reliance on something that exists only online
This concern is probably the least troublesome of all three, as most developers will presumably have an Internet connection. However, if you are relying on ChatGPT for the majority of your work, you may be lost if your Internet connection is compromised. In the future, it would be awesome to have an onsite version of the Large Language Model to ensure engineers have guaranteed access. A model developer firm could license local versions (based on a general set of training data) to individual corporations. The client companies could then continue to train their individually-licensed models on their own internal data. Intellectual property would be safeguarded and models would be more versatile and accurate when responding to user needs.
Wrapping Up
Thank you for reading this week’s post! It was really quite amazing working with ChatGPT. I learned some new and crazy things, like the Python “Easter egg” built-in module called this, which contains “The Zen of Python,” a poem written by Tim Peters (?!).
As always, please feel free to leave a comment below!
The technology is out there, available for anyone with an internet connection. Insane. If you have interacted with ChatGPT, it should be immediately clear that this isn’t Alexa, Siri, or ‘OK, Google.’ This is something new that will reshape how humans create. Strap in, embrace it, and avoid the easiest route of existential threat.
We are taking a break from the usual content this week to talk about something that both scares the hell out of me and sparks awe – ChatGPT. The implications of this technology are so general that it is difficult to avoid divergent, runaway speculation. In this article, I focus on the practical implications of a tool like ChatGPT for a software engineer like myself.
With ChatGPT as such an accessible and erudite resource, my immediate reaction is: as a developer, what am I good for now? I pride myself on being able to cleverly solve problems with software. How do I find the joy of helping people if ChatGPT is readily available, free, and often more capable than me at a specific task?
ChatGPT is Here: Get Over It
So…you worry that your precious skill has been disrupted and you have little security in your identity and ability to be helpful. It has happened to many before you and will continue to happen to those who believe they can ride a single skill to the grave. As we are continually reminded, no skill or ability is that special anymore.
“Once a new technology rolls over you, if you’re not part of the steamroller, you’re part of the road.”
Stewart Brand
Time to step up and face what’s coming! Where do we go from here?
Architects from the Start
For every motivated software engineer, there comes a day when they must step away from writing code to begin breaking apart problems and designing solutions for others to implement.
I am in this exact situation now. I love to code, but I need to begin exercising my system design skills. In the early transition to this new technology, this will still include coaching junior engineers in implementing software designs. Eventually, the junior engineer will be replaced by queries to a Large Language Model like ChatGPT.
Fighting your way through difficult code, learning how to deal with the complexity of large projects, and specializing in something specific (e.g., a programming language or a piece of hardware) are all things the software engineer of the future will be liberated from.
I believe the software engineer of the future, coming straight out of university, will simply be a system architect who knows how to ask a Large Language Model like ChatGPT the right questions to generate the vision they have. They will be able to skip the 3-5 years of code/hardware specialization.
ChatGPT: High-Level System, Details Later
If you want to be successful in the future as a software engineer, you must be creative and think at the system level. Use ChatGPT to design systems that automate meeting basic human needs, such as food, water, and shelter.
Start from the top-down. Is your goal something big or something small? Write it down and start breaking it into sub-problems. Can’t think of how to break it apart? Ask ChatGPT!
In the example above, I asked ChatGPT to help me develop a problem set to meet the end goal of desalinating water to help a small population. ChatGPT responded with a useful, 8-point list. Now, I have 8 tasks to target for design and improvement. While ChatGPT was able to help with the plan, ChatGPT is not going to perform all 8 of these tasks alone. There is plenty of work left for a human. ChatGPT is not going to help us simply by existing. Human interaction is REQUIRED.
My hope is that this kind of tool is used to meet many human needs through automation, securing our population.
Strong Character Needed
Today, the early stumbling of an inexperienced software engineer has little effect on the large system they belong to. In the future, when a software engineer designs a large-scale and accidentally flawed system with ChatGPT, the consequences will be far more drastic. Future software engineers should be taught and frequently reminded of their responsibility when using this kind of tool.
Careful consideration of each new technology that is developed using this tool is required. We need engineers with discipline and character.
Conclusion
I hope that this article sparks hope in those (like me) who see a tool like ChatGPT and consequently fear that they are no longer useful. Our responsibility has grown greatly and the stakes are high, but with strong character, I think the future is bright for humanity.
On a closing note, I am embracing this technology and I refuse to be steamrolled. Moving forward, my articles will focus on design and architecture. Any code implementation will be carried out by Large Language Models like ChatGPT, then integrated by me to meet the final design requirements.
Please leave a comment if you find this thought-provoking or have anything to add to the headings I have here.
If you are interested in reading about some of my other projects (which were not implemented by AI), please check out the following:
This week, we explore the asyncio library and calculate how long it would take for a single process to scan the entire Internet’s worth of IPV4 addresses (…only 4,294,967,296 total addresses!).
In this article, I briefly walk through the asyncio library and create a real-world example!
If you are new to Python or programming in general, I would advise you to learn the basics before approaching asyncio. This is a bit of an advanced topic, so having a solid grasp on the foundations of synchronous programming will be a requirement here.
A [Kind of] Experienced Software Guy
Asyncio
Asyncio was introduced as a built-in library in Python version 3.4. It is intended for IO-bound operations. One example is network communication, as it is often the case that you will wait for data to be returned from a remote server.
In the time spent waiting, however, you may have an opportunity to complete other work. This is where an asynchronous application shines.
You define a co-routine by simply adding the ‘async’ keyword in front of the normal ‘def’ keyword, as shown below:
async def main():
    print("Hello from a co-routine!")
When you want to run this function, it must be done via a runner like this:
asyncio.run(main())
The Mental Model of Asyncio
When a runner is created to execute one or many asynchronous tasks, it starts by creating an event loop.
“The event loop is the core of every asyncio application.”
Below is a very simplistic mental model for a basic asyncio application.
Each time an asyncio ‘Task’ has to ‘await’ something, there is an opportunity for the other tasks in the loop to execute more of their work if they are no longer ‘awaiting.’
Our mental model looks very similar to a single-threaded synchronous application. However, each of the asynchronous tasks is handled independently, as long as no single task is hogging attention from the event loop.
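That interleaving is easy to see with a toy example (not from the original post): two tasks that each sleep for 0.2 seconds finish in roughly 0.2 seconds total, because each ‘await’ hands control back to the event loop.

```python
import asyncio
import time

async def worker(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # 'await' yields control back to the event loop
    return name

async def main():
    start = time.perf_counter()
    # run both workers concurrently and collect their results in order
    results = await asyncio.gather(worker("a", 0.2), worker("b", 0.2))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
# the two 0.2-second waits overlap, so elapsed is roughly 0.2 s, not 0.4 s
```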
Concrete Example in Asyncio
Now we have all that is needed to develop a solution for an important question:
How Long Would it Take?
How long would it take for a single-threaded program to scan the entire Internet’s worth of IPV4 addresses?
Let’s start by defining a class that can generate all of the IP addresses.
from dataclasses import dataclass
from typing import List

@dataclass
class IPAddress:
    subnet_ranges: List[int]
Above, we define a data class that simply holds a list of subnet values. If you are familiar with IPV4, you probably guessed that this list will have a length of four. I didn’t go out of my way to enforce the length, but in a production environment it would be ideal to do so.
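Later code calls str() on these address objects, which implies a __str__ method that isn’t shown in this excerpt. A minimal self-contained sketch, assuming dotted-quad formatting:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class IPAddress:
    subnet_ranges: List[int]

    def __str__(self) -> str:
        # join the four subnet values into the familiar dotted-quad form
        return ".".join(str(v) for v in self.subnet_ranges)

addr = IPAddress([192, 168, 0, 1])
```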
We now have something that can hold our IP address in a sane way, but we still need to generate them.
@classmethod
def get_all_ips(cls, start: int = 1, stop: int = 255) -> 'IPAddress':
    first = second = third = fourth = start
    while fourth < stop:
        first += 1
        if first >= stop:
            first = start
            second += 1
            if second >= stop:
                second = start
                third += 1
                if third >= stop:
                    third = start
                    fourth += 1
        curr_ip = cls([fourth, third, second, first])
        if cls.is_valid_ip(curr_ip):
            yield curr_ip
For this, I introduce a factory class method that will yield IP addresses with the default range of ‘1.1.1.1’ to ‘255.255.255.255.’ The method increments the least-significant subnet value and rolls over to the higher-order subnets each time its value reaches 255. The bulleted list below illustrates the method’s address outputs.
1.1.1.1
1.1.1.2
1.1.1.3
…
1.1.1.254
1.1.1.255
1.1.2.1
…
255.255.255.254
255.255.255.255
If you have a keen eye, you will have likely noticed the ‘is_valid_ip’ class method. It’s called just before yielding to the calling function.
This function simply checks that the IP address is not inside one of the reserved private ranges. See below:
@classmethod
def is_valid_ip(cls, ip_Address: 'IPAddress') -> bool:
    if ip_Address.subnet_ranges[0] == 0:
        return False
    if ip_Address.subnet_ranges[0] == 10:
        return False
    if ip_Address.subnet_ranges[0] == 172 and 16 <= ip_Address.subnet_ranges[1] <= 31:
        return False
    if ip_Address.subnet_ranges[0] == 192 and ip_Address.subnet_ranges[1] == 168:
        return False
    return True
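As an aside, the standard library’s ipaddress module already knows these reserved ranges; a small cross-check sketch (the ‘is_public’ helper name is mine, not from the original code):

```python
import ipaddress

def is_public(ip: str) -> bool:
    # the stdlib encodes the RFC 1918 private ranges (and more) out of the box
    return not ipaddress.ip_address(ip).is_private

public = is_public("8.8.8.8")
private = is_public("192.168.0.1")
```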
Are We Asynchronous Yet?
No, not yet…but soon! Now that we have our IP address generator defined, we can start building an asynchronous function that will do the following:
Iterate our generator function an N-number of times to get a batch of IPs.
Create an asynchronous task for each IP address in our batch which checks if a port is open.
By adding timers to this code, we will find out how long it would theoretically take! Keep in mind that we already know the performance impacts of using Python vs. C++, but this is a toy problem, so…Python is perfect.
Iterate our Generation Function
for _ in range(IP_COUNT_PER_GATHER):
    try:
        next_group_of_ips.append(next(ip_addr_iter))
    except StopIteration:
        complete = True
        break
Above is how we will iterate our IP generator function.
Create an Asyncio Task For Each IP
for ip in next_group_of_ips:
    async_tasks.append(asyncio.create_task(check_port(str(ip))))
We create a task for each IP port check and store a reference to the task.
Asyncio: Await Results
With multiple tasks executing at the same time, we don’t gain much if we have to ‘await’ each one individually. To solve this problem, the asyncio library has a function called ‘gather.’ See below for how I used ‘gather’ in this application:
await asyncio.gather(*async_tasks)
for task, ip in zip(async_tasks, next_group_of_ips):
    if task.result():
        ip_addresses_found.put(str(ip))
By ‘awaiting’ the ‘gather’ function, we are actually awaiting every task in the list. Once they have all completed, any task whose result is not None indicates an open port, and we add the corresponding IP to our queue of addresses to process later.
All Together!
The whole function together looks like this:
async def main(ip_addresses_found: Queue, start: int = 1, end: int = 255):
    ip_addr_iter = iter(IPAddress.get_all_ips(start, end))
    complete = False
    while not complete:
        next_group_of_ips = []
        async_tasks = []
        for _ in range(IP_COUNT_PER_GATHER):
            try:
                next_group_of_ips.append(next(ip_addr_iter))
            except StopIteration:
                complete = True
                break
        # iterate the actual batch, which may be shorter on the final pass
        for ip in next_group_of_ips:
            async_tasks.append(asyncio.create_task(check_port(str(ip))))
        await asyncio.gather(*async_tasks)
        for task, ip in zip(async_tasks, next_group_of_ips):
            if task.result():
                ip_addresses_found.put(str(ip))
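The ‘check_port’ co-routine used above isn’t shown in this excerpt. A minimal sketch of what it might look like, assuming a TCP connect probe on port 80 with a short timeout (the implementation details here are mine, not the original code):

```python
import asyncio

async def check_port(ip: str, port: int = 80, timeout: float = 1.0):
    """Return the IP if the TCP port accepts a connection, else None."""
    try:
        _, writer = await asyncio.wait_for(
            asyncio.open_connection(ip, port), timeout
        )
        writer.close()
        await writer.wait_closed()
        return ip
    except (OSError, asyncio.TimeoutError):
        return None

# 192.0.2.1 is in the reserved TEST-NET-1 range, so this probe should fail
closed = asyncio.run(check_port("192.0.2.1", timeout=0.2))
```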
Conclusion
The time has come to share my results! These will vary based on Internet latency and machine hardware.
I set my scan to do 10k IPs per batch, with a timeout on the connection of 1 second. This resulted in an average batch runtime of ~1.3 seconds.
I didn’t let it run to see how long it would actually take (running this program had major effects on our ability to use the Internet), but if you divide the total number of possible IPs by our batch size of 10k, you get ~430k batches. At 1.3 seconds per batch, that totals 558,346 seconds, or 6.46 days of constant running.
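As a sanity check on that arithmetic:

```python
TOTAL_IPS = 2 ** 32          # 4,294,967,296 possible IPV4 addresses
BATCH_SIZE = 10_000
SECONDS_PER_BATCH = 1.3

batches = TOTAL_IPS / BATCH_SIZE      # ~430k batches
seconds = batches * SECONDS_PER_BATCH
days = seconds / 86_400               # 86,400 seconds per day
```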
Not as bad as I originally thought 🙂
Fun fact: I first was introduced to co-routines while programming my Paper Boy game in Unity!
Thanks for reading this week! Please ‘like’ if you enjoyed the content. Feel free to leave any comments or suggestions as well!
Have you ever wondered how Python libraries are so fast? Well, it’s because they aren’t written in Python! C and C++ form the backbone that makes Python so powerful. This week, we will look at how I adopted this idea for the spelling maze generator we built previously!
The post from two weeks ago really exemplified why being able to program in C++ gives you unending power in the realm of efficiency gains. Unfortunately, however, most of the world has decided to move on to the cleaner and more abstract sandboxes of Python, mainly to reduce the development cost of building new software. I guess some people just aren’t here for the pain like others…
Anyways, this week’s post tackles how the hardcore C++ programmer can still make friends in the Python playground.
It’s Already Built
Lucky for us, people share their tunnels to brighter levels of existence — all we have to do is look and follow!
The simplest way to make your C++ program usable in the high-level language of Python is to use the Pybind11 Library.
pybind11 is a lightweight header-only library that exposes C++ types in Python and vice versa, mainly to create Python bindings of existing C++ code.
Pybind11 Frontpage
What does it mean?
We just have to include a header file to be able to bind functions to a module that is importable/callable from a Python script.
Where would this be useful?
I am glad you asked! We already have a project that is a perfect candidate. Do you remember when we produced spelling mazes to help my son study for spelling? Then, we rewrote it in C++ to measure the efficiency gains…which were sizeable.
Well, why not pair the two? Then we can keep our web front-end built in Python and back it up with our C++ library!
Code Time (C++, CMake)
So really, our code is going to be very minimal. All we need to do is:
Create a single function that accepts a word and saves our maze to a file
Bind that function to a Python module using Pybind11’s PYBIND11_MODULE macro
The Function (C++)
void generate_maze(std::string word, int grid_width, int grid_height, std::string file_prefix, int block_width = 20, int block_height = 20){
    generator.seed(time(0)); // 'generator' is the project's random engine, defined elsewhere
    WordMaze m(word, grid_width, grid_height, block_width, block_height);
    m.save_to_png(file_prefix + word + ".png");
}
Above, you can see the function we use to create our WordMaze object in C++ and save it off to a file. You might notice we are taking in a few more parameters than we had previously discussed. This is just to make it easier to configure the maze from the Python side.
The Module Binding (C++)
PYBIND11_MODULE(SpellingMaze, m) {
    m.def("generate_maze", &generate_maze, "A function to generate a maze.");
}
Could it really be that simple? Well, yes and no. This is the code that is required, but actually building it is the part that will sneak up on you (especially if you are on a Windows machine).
Building
To preface this section, I will say:
If you are building a first-party app without third party dependencies in C++, building with Pybind11 is going to be really easy (their website does an excellent job of walking you through the many ways to build).
Travis (He’s a NOOB at C++ build systems)
However, we don’t have it that easy, since we need to also link against SFML.
Story Time
So, with my little experience setting up C++ build systems (none), I set out to combine the powers of SFML and Pybind11 to create a super fast high-level Maze Generator!
SFML, Pybind11 and MSYS2 walk into a bar…
Previously, I wrote an article on how to set up a Windows build environment for compiling simple SFML applications. Looking back now, I realize there are definitely better ways to do this.
For this project, I struggled to wrap my head around including Pybind11 in my already convoluted SFML build system. From the examples given on the Pybind11 website, it was clear my CMake approach wasn’t their go-to option for building.
In fact, the CMake example they give requires that the entire Pybind11 repository be included as a submodule to my repository. So, after a few failed attempts at copy-pasting their CMake config, I took a different approach.
Side note: I like using Visual Studio Code. It is by far my favorite IDE with all of its plugins and remote development solutions. However, in this case, the CMake plugin makes it nearly impossible to tell which configuration is being generated.
Eventually, I opted for using MSYS2 directly, where I was able to install the Pybind11 and SFML dependencies through its package manager.
Just like that, we can bring the power of C++ to the world of Python (even on a Windows machine!). I hope you enjoyed this post! I am always open to feedback. Let me know if there is anything you would like me to dive into, attempt, and/or explain! I am more than willing to do so. Please add your questions and suggestions in the comments below!
This week, we will create a Linux program wrapper for the ‘traceroute’ program to make tracing a network route easier for a Python user. By the end of this article, I will have demonstrated a few useful techniques in data parsing and object orientation that you might find helpful.
The Target: ‘traceroute,’ Linux, and Python
Here is an abstract view of the entire system:
In the diagram, the ‘Python Linux Layer’ is the thing we want to program. ‘Traceroute’ resides at the lowest ‘Linux’ layer.
We want to wrap the traceroute tool on a Linux system and make it usable in a Python program. ‘Traceroute’ lists the route that a network packet follows in order to retrieve the information you seek from the remote host.
A Plan
We need to be able to:
Call the program from Python and capture its data
Parse the data into an importable and usable object
The first will be relatively easy, but the second is where the work lies. If we do it in a systematic way and keep the goal in mind, however, it shouldn’t be too difficult.
Calling the ‘traceroute’ Program and Storing its Output
To start, we invoke ‘traceroute’ in IPV4 mode (the ‘-4’ flag) using subprocess’s ‘call’ method:

subprocess.call(["traceroute", "-4", "periaptapis.net"])

Right now, it is printing directly to the console, which is not ideal — we need that data for parsing!
To accomplish this, we will switch the method we call from ‘call’ to ‘run.’ This method has a ‘capture_output’ parameter. According to the documentation, when ‘capture_output’ is set to True, the ‘run’ method will return an object which contains the text of the program’s output in stdout.
import subprocess
ret = subprocess.run(["traceroute", "-4", "periaptapis.net"], capture_output=True)
We can access that captured output like this:
print(ret.stdout.decode())
We use the ‘decode’ method because stdout is in bytes and we need a string (more on that here).
Wow! Only 3 lines of code and we already have the data in a place that is very accessible!
Data Parsing
One of the most useful skills I think I’ve gained from programming in Python is data parsing.
Specifically, as a firmware engineer, I produce a lot of serial log data for the boards that I work on. Without the data parsing techniques I have accumulated, I would be spending the majority of my time scrolling long text files looking for less than 1% of the information they hold. Not just once either, because I (like every other programmer) never get it right on the first try.
Data parsing is also useful elsewhere in website scraping, big data collection, and more. Once you get fast enough, it will be hard not to write a parser for everything.
Taking a Peek at our ‘traceroute’ Output Data
The 3-line program we wrote above produces the following output:
traceroute to periaptapis.net (147.182.254.138), 30 hops max, 60 byte packets
1 home (192.168.0.1) 0.267 ms 0.618 ms 0.583 ms
2 boid-dsl-gw13.boid.qwest.net (184.99.64.13) 3.425 ms 3.177 ms 3.032 ms
3 184-99-65-97.boid.qwest.net (184.99.65.97) 3.143 ms 3.458 ms 3.038 ms
4 ae27.edge6.Seattle1.Level3.net (4.68.63.133) 13.147 ms 13.216 ms 30.300 ms
5 * * *
6 * DIGITAL-OCE.edge9.SanJose1.Level3.net (4.71.117.218) 29.985 ms *
7 * * *
8 * * *
9 * * *
10 147.182.254.138 (147.182.254.138) 31.194 ms 30.308 ms 29.753 ms
Oof, that is going to be a pain to parse out the data we want. Let’s take this apart.
The first line is something we already know – we are tracing the route to ‘periaptapis.net.’ From there, each line represents a station that the packet is routed through. These “stations” are things like routers and gateways. The last line is the host that we wanted to reach.
By default, there are 3 queries per station. This helps the user know a couple of things, like average round-trip time and whether there are other hosts that respond to packets at that station number. Below is the breakdown for one of our lines of output:
The station number
The hostname and IP address of the station
Query #1 round-trip time
Query #2 round-trip time
Query #3 round-trip time
If there is an asterisk in a field, the station didn’t respond to the packet we sent. Also, there is a chance that we get a different host in any of the three queries for a given station number. This is due to network configurations to handle incoming requests.
Simplifying Output
To simplify the output of the ‘traceroute’ program and make our parsing lives easier, we will introduce the ‘-n’ and ‘-N1’ flags to our ‘traceroute’ program call.
-N1
The ‘-N1’ tells the ‘traceroute’ program to only use one packet at a time.
Sending one packet at a time is just a kindness to the devices we will be tracing so we don’t send a bunch of packets each time we want to trace a route. The downside is that the results will be slower to come in.
-n
The ‘-n’ simply tells traceroute to not print the hostnames, simplifying things further by obtaining only the IP address of the station.
With those two flags added, our call now looks like this:
ret = subprocess.run(["traceroute", "-4", "-N1", "-n", "periaptapis.net"], capture_output=True)
Our call also produces a clean output:
traceroute to periaptapis.net (147.182.254.138), 30 hops max, 60 byte packets
1 192.168.0.1 0.657 ms
2 184.99.64.13 9.288 ms
3 184.99.65.97 3.681 ms
4 4.68.63.133 14.012 ms
5 4.69.219.65 29.552 ms
6 4.71.117.218 30.166 ms
7 *
8 *
9 *
10 147.182.254.138 30.837 ms
This will make our lives much easier!
Coding an Object
Finally we get to the object creation that will be importable and usable by another Python program. We will call this object ‘TraceRouteInstance’ and it will have a single constructor input of ‘hostname_or_ip,’ which we will use in our ‘traceroute’ call.
We store our ‘traceroute’ return data in a ‘traceroute_data’ variable for later parsing. To store each of the stations, we will use a named tuple, shown below:
Station = namedtuple('Station', ['ip', 'latency_ms'])
The syntax might look a little strange, but it enables us to reference the data in the tuple by a name which looks cleaner in the long run.
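For example, fields of the named tuple can then be accessed by attribute name (using the ‘latency_ms’ field name from the final object code):

```python
from collections import namedtuple

# field names must be valid Python identifiers, hence 'latency_ms'
Station = namedtuple('Station', ['ip', 'latency_ms'])

hop = Station('192.168.0.1', 0.657)
```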
Parsing
Next we need to parse the output.
One way to do this would be by stripping leading and trailing white space on each line, then checking for a digit at the first character. Stripping the line of white space at the start will ensure that the line has a digit as its first character (if it is a line we care about).
This is how a newcomer might approach the problem.
However, to avoid so much string manipulation, I always suggest using regular expressions for data extraction from lines of text. Regular expressions are difficult to understand in the beginning, but once you do understand them, they will save you headaches down the road and immensely reduce the complexity of data parsing.
Enter Regular Expression
Let’s take a look at the regular expression we will be using for this exercise:

(?P&lt;station_number&gt;\d+) (?P&lt;ip_address&gt;\d+\.\d+\.\d+\.\d+) (?P&lt;latency&gt;\d+\.\d+) ms

In this, we have 3 named groups: ‘station_number,’ ‘ip_address,’ and ‘latency.’ We can use this regular expression to search a line and reference the named groups to extract the data we want.
Our parse function looks like this:
def parse_data(self):
    station_regex = r"(?P<station_number>\d+) (?P<ip_address>\d+\.\d+\.\d+\.\d+) (?P<latency>\d+\.\d+) ms"
    for line in self.traceroute_data.split("\n"):
        re_match = re.search(station_regex, line)
        if re_match:
            ip_address = re_match.group("ip_address")
            latency = float(re_match.group("latency"))
            self.route_list.append(Station(ip_address, latency))
        elif '*' in line:
            self.route_list.append(Station('*', '*'))
We take the output line-by-line, searching the line for our regular expression. If the regular expression is found, we use the named groups to extract the data from the line, adding it to our ‘route_list.’
If we don’t find the regular expression but we see an asterisk, we assume that the station didn’t respond and add a default value to our ‘route_list.’
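As a quick sanity check, the pattern can be exercised against a single line of the simplified (‘-n’, ‘-N1’) output:

```python
import re

station_regex = r"(?P<station_number>\d+) (?P<ip_address>\d+\.\d+\.\d+\.\d+) (?P<latency>\d+\.\d+) ms"

# one line of the simplified traceroute output
line = " 4 4.68.63.133 14.012 ms"
match = re.search(station_regex, line)
```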
Final Object Code
Finally, our importable ‘traceroute’ Python wrapper looks like this:
import subprocess
import re
from collections import namedtuple

Station = namedtuple('Station', ['ip', 'latency_ms'])

class TraceRouteInstance:
    def __init__(self, hostname_or_ip: str):
        self.traceroute_data = subprocess.run(
            ["traceroute", "-4", "-N1", "-n", hostname_or_ip],
            capture_output=True
        ).stdout.decode()
        self.route_list = []
        self.parse_data()

    def parse_data(self):
        station_regex = r"(?P<station_number>\d+) (?P<ip_address>\d+\.\d+\.\d+\.\d+) (?P<latency>\d+\.\d+) ms"
        for line in self.traceroute_data.split("\n"):
            re_match = re.search(station_regex, line)
            if re_match:
                ip_address = re_match.group("ip_address")
                latency = float(re_match.group("latency"))
                self.route_list.append(Station(ip_address, latency))
            elif '*' in line:
                self.route_list.append(Station('*', '*'))
As a simple test, let’s write a main function in this script that utilizes the above object and prints out the trace list.
if __name__ == "__main__":
    tr = TraceRouteInstance("periaptapis.net")
    print(tr.route_list)
If you run this, it will eventually print the parsed list of ‘Station’ tuples.
This output shows that we have successfully extracted the route that a packet would take from my machine to requesting data from ‘periaptapis.net.’
Conclusion
In this post, we explored wrapping a Linux program with a Python interface to make it accessible to another Python program. If you run this program, however, you will notice how long the object takes to be created. This is due to the length of time it takes for a route to be traced using ‘traceroute.’
In a future post, we might explore options on how to mitigate the effects of this load time for external programs by using the ‘asyncio’ built-in Python library or by multi-threading.
I hope you enjoyed this week’s post! Please add any comments, suggestions, or questions that you may have! If you are interested in other articles I have written, check out these links:
Has my ego inflated in the last couple of weeks from programming in C++? Maybe, maybe not… All I know is that C++ is leagues faster than Python. Don’t believe me? Well, I am glad you are here! This week, we will be converting a Python program I wrote and documented here to its C++ counterpart.
At the end of this post, I hope to have concrete evidence of just how much faster C++ can be! Let’s dive in!
Measurement
First, we must take measurements on the portion of code we will be converting. If you haven’t read the original article here, I recommend you give it a peek. Here’s a snippet of how the original program generates mazes:
for i in range(args.num_to_generate):
    maze = WordMaze(args.word, args.grid_width, args.grid_height, args.pixel_width, args.pixel_height, args=args)
    maze.save_image(args.filename.split(".")[0] + "_" + str(i) + ".png")
We will be timing the maze generation and the image storage. To do this, we will utilize the following generator function for a set of input values:
def test_maze_inputs():
    words = ["Hello", "Random", "Testing", "Maze", "Generation"]
    grid_widths = [(20, 20), (40, 40), (80, 80), (160, 160)]
    for word in words:
        for width in grid_widths:
            yield argparse.Namespace(word=word, grid_width=width[0], grid_height=width[1], pixel_width=20, pixel_height=20)
We will be building mazes at the sizes of 20 x 20, 40 x 40, 80 x 80, and 160 x 160. These will be pixel sizes 400 x 400, 800 x 800, 1600 x 1600, and 3200 x 3200, respectively. The values chosen are arbitrary and all runs will be done on the same system.
Below is the timing test run code:
if __name__ == "__main__":
    # args, _ = parseargs()
    maze_generation_times = []
    image_save_times = []
    for test_input in test_maze_inputs():
        print(f"Testing Inputs: {test_input}")
        # Time Maze Generation
        start_time = time.perf_counter()
        maze = WordMaze(test_input.word, test_input.grid_width, test_input.grid_height, test_input.pixel_width, test_input.pixel_height)
        stop_time = time.perf_counter()
        total_time = stop_time - start_time
        maze_generation_times.append(total_time)
        # Time Maze Saving
        start_time = time.perf_counter()
        maze.save_image("maze.png")
        stop_time = time.perf_counter()
        total_time = stop_time - start_time
        image_save_times.append(total_time)
    # Print our table
    print("CSV:\n")
    header_printed = False
    for ti, test_input in enumerate(test_maze_inputs()):
        if not header_printed:
            for key in test_input.__dict__:
                print(key, end=",")
            print("Generation Time,Image Save Time,")
            header_printed = True
        for key in test_input.__dict__:
            print(test_input.__dict__[key], end=",")
        print(f"{maze_generation_times[ti]},{image_save_times[ti]},")
It will print out a nice CSV-style output for us to curate in Excel.
All right, we have our times to beat! Let’s convert some code!
C++ FTW… Maybe
To make this as fair as possible, I will do my best not to adjust the design of the program. I will simply translate it to C++ and run the same benchmarks.
One does not simply translate from Python to C++
Basically Anyone (except Travis)
Here I am a week later, literally. I have a functioning C++ program that generates the same output you would expect to see from the Python script in my previous post.
Oh boy, it even looks like it has a memory leak! Let’s go ahead and try to fix that up really quickly….
DONE! My brain is fried. I am going to simply let the results speak for themselves:
Python vs C++ Results
Python and C++ clearly diverge even at the first data point. On the bar graph, you can see the stark differences by maze size. For the largest maze size in this test (3200 x 3200 pixels), Python was about 4 times slower than C++!
On the scatter plot, you can see the time differences plotted out by actual square pixels. The linear fits of the data are pretty good, with R-squared values slightly exceeding 0.99. The linear fit equations could be useful for estimating time values for other maze sizes that we did not test.
The desired maze size (in square pixels) would be substituted for the ‘x,’ and solving for the ‘y’ would result in a time estimate. You can see that the slope (time divided by square pixels) of the Python algorithm is steeper than C++, further showcasing C++’s efficiency.
Conclusion
This was an excellent learning experience for me. In the process of rewriting my code, I got a close-up look at various inconsistencies and poor algorithmic choices that came along with my first implementation.
I have yet to fix these things, as it was important to me to have an apples-to-apples comparison between the two codebases. If you compare the GitHub repositories (Python vs C++), you will find there are very few design differences between the two.
This week, we are generating feed-forward neural networks which learn through a genetic algorithm. These networks are much easier to understand than their back-propagation counterparts. Please join me on this step-by-step guide for simplistic neural networks as they learn through generations of selection.
Neural Networks
Neural networks can be a difficult subject to dive into during ground-up implementation. Here, I will explain a simple ‘feed-forward’ network, as we do not need back-propagation for this article. Fortunately, back-propagation is the only part of a neural net which requires deep knowledge of calculus.
So, if you fear feeling dumb like me, fret not! Multiplication is all we need!
I chose evolution because it requires less math 😉
Not Charles Darwin
Feed-Forward Neural Networks
A feed-forward neural network only feeds data in one direction across the network.
The above diagram shows a neural network with 2 inputs, 4 hidden layer neurons, and 2 output neurons. The data flows from left to right. At each step, the data is modified in a parallel fashion by each neuron.
A neuron simply takes in all of the given inputs, multiplies each input by a paired weight stored by the neuron, and sums the products. That sum is passed through a sigmoid function, and the result is the neuron’s output.
Looking back at our neural network diagram, the hidden layer neurons each have 2 inputs. So, each neuron will store 2 weights. Therefore, the hidden layer stores 8 weights in total.
Neural Networks and Genetic Algorithms
The basic rule which governs my algorithm is:
At the start of each generation, select the highest performing agents from the previous generation and combine their neural networks to form the next generation’s agents.
The important factors are 1) how you select the top performers, and 2) how the next generation is stitched together from the previous winners.
Code Time
The code base for this project has a lot of additional functionality that is necessary to facilitate agents’ learning. However, to keep this post short and sweet, I am going to focus on the key areas that make this process work.
The Problem to Solve
The problem which our agents must solve is simple. They must get as close to the goal as possible, given the number of ‘ticks’ in a generation.
Overview
A ‘tick’ is a time step allotted to each agent in which they can choose which direction to move and how far that move will be.
When all ticks have been counted, the program orders the agents by least to greatest distance from the goal. Next, it selects a few agents from the top (closest to goal) to seed our next generation. This process continues until we reach the last configured generation.
Main Loop
for (int generation = 0; generation < NUM_GEN; generation++)
{
    // Set up this generation's agents
    if (!closest)
        ... // Set up our initial generation of agents
    else
        ... // Stitch our next set of agents from those that came closest in the previous generation

    // Run our simulation
    run_sim(*agents, goal);

    // Rank our agents and take the configured number of top performers
    closest = get_closest_agents(*agents, goal, generation, NUM_AGENTS_SELECTED_EACH_GENERATION);

    if (PRINT_GENERATION_PERFORMANCE)
        std::cout << closest->at(0)->distance << "," << mutation_chance << std::endl;
}
Our main loop is simple:
Set up the generation’s agents.
Run the simulation.
Rank the agents by final distance to the goal.
Setting Up the Neural Networks
The first generation is a set of agents. Each agent has a neural network that is initialized with random weight values and contains the following elements:
2 inputs (distance to goal in x- and y-directions)
10 hidden neurons
4 outputs
Two outputs represent the x- and y-distances that the agent wants to move, each between 0 and 1. The other two outputs dictate whether each direction of movement is positive or negative.
The following code implements agent creation:
for (int agent_i = 0; agent_i < NUM_AGENTS_PER_GEN; agent_i++)
{
    // Check if we are basing our agents on anything
    if (based_on)
    {
        int choice1 = get_rand_int(0, based_on->size() - 1);
        int choice2 = get_rand_int(0, based_on->size() - 1);
        while (choice2 == choice1)
            choice2 = get_rand_int(0, based_on->size() - 1);

        // Using the default mutation delta
        agents->push_back(new Agent(new Position(start_pos->x, start_pos->y),
                                    based_on->at(choice1)->agent,
                                    based_on->at(choice2)->agent,
                                    mt, mutation_chance));
    }
    else
    {
        // Create agents with two sensors: distance from goal in x and y
        agents->push_back(new Agent(new Position(start_pos->x, start_pos->y), 2));
    }
}
Simulation Run
The simulation runs for the number of ticks configured. At each tick, it “asks” the agent for a desired distance and direction, based on the agent’s current distance from the goal.
void run_sim(std::vector<Agent *> agents, Position *goal)
{
    for (int tick = 0; tick < NUM_TICKS_PER_GEN; tick++)
    {
        for (Agent *a : agents)
        {
            // Sense the agent's distance from the goal
            float sensors[2] = {
                (a->positions.back()->x - goal->x) / BOUNDARY_EDGE_LENGTH,
                (a->positions.back()->y - goal->y) / BOUNDARY_EDGE_LENGTH};

            // Ask the agent what it wants to do
            a->move(sensors);
        }
    }
}
A Few More Specifics
Selecting top performers
It is important to note that the way you select your ‘winners’ greatly determines the behavior of your agents. My program selects winners in the following code:
std::vector<AgentDistancePair *> *agent_distance_pairs = new std::vector<AgentDistancePair *>;

for (Agent *a : agents)
    agent_distance_pairs->push_back(
        new AgentDistancePair(a, get_distance(*(a->positions.back()), *goal)));
The algorithm pairs each agent’s pointer with its respective distance from the goal. This pairing is encapsulated in the ‘AgentDistancePair’ object. Then, the program sorts the vector of pairings with std::sort, using the compare_agent_distance_pair comparator.
Finally, we trim the list down to the number of top performers the caller asked for, taking care to free each discarded pairing as we go so memory is managed properly.
while (agent_distance_pairs->size() > num_to_find)
{
    delete agent_distance_pairs->back();
    agent_distance_pairs->pop_back();
}
All together, the whole function looks like this:
std::vector<AgentDistancePair *> *get_closest_agents(std::vector<Agent *> agents,
                                                     Position *goal,
                                                     int generation_number,
                                                     int num_to_find = 2)
{
    std::vector<AgentDistancePair *> *agent_distance_pairs = new std::vector<AgentDistancePair *>;

    for (Agent *a : agents)
        agent_distance_pairs->push_back(
            new AgentDistancePair(a, get_distance(*(a->positions.back()), *goal)));

    sort(agent_distance_pairs->begin(), agent_distance_pairs->end(),
         compare_agent_distance_pair);

    if (DRAW_GENERATION_PERFORMANCE && (generation_number % DRAW_EVERY_NTH_GENERATION) == 0 && DRAW_FULL_POPULATION)
        draw_agent_path(agent_distance_pairs, goal, generation_number);

    while (agent_distance_pairs->size() > num_to_find)
    {
        delete agent_distance_pairs->back();
        agent_distance_pairs->pop_back();
    }

    if (DRAW_GENERATION_PERFORMANCE && (generation_number % DRAW_EVERY_NTH_GENERATION) == 0 && !DRAW_FULL_POPULATION)
        draw_agent_path(agent_distance_pairs, goal, generation_number);

    return agent_distance_pairs;
}
Stitching together agents of the past
In a genetic algorithm, the neural networks learn by being stitched together with other top performers, sprinkled with a bit of randomness. To stitch the weights together, we choose one of three methods:
EveryOther
SingleSplit
RandomChoice
EveryOther
The ‘EveryOther’ method builds the child by alternating between the two parents, taking every other weight from each, as shown in the diagram below:
SingleSplit
The ‘SingleSplit’ method picks a random weight index as the point where responsibility for providing the child’s weights switches from one parent to the other. You can see an example of this happening at the ‘Split Line’ in the diagram below:
RandomChoice
‘RandomChoice’ method randomly selects weights from both parents as it builds the child weights. No diagram needed ;).
Conclusion
This is Generation #5 of a ‘SingleSplit’ method simulation. The green dot is the selected agent of the generation and the yellow dot is its goal!
This has been a huge learning experience for me! I will definitely be adding tiny brains to many things in the future.
If you are interested in exploring or cloning this code, you can access it on my Github here. Thank you for reading! I hope you found this helpful and thought-provoking! As always, feel free to leave a comment or suggestion below.
Further reading on my work in simulations is available at these links: