
AI: The rise of the machines (sort of)


I’ve given myself over totally to AI. For two weeks(ish), I’m sacrificing my years of experience on the altar of Artificial Intelligence – I will, at every opportunity, ask ChatGPT (3.5 in my case) and GitHub Copilot to guide my every decision.

I’ve been using Copilot for about six months now and I’ve loved the almost psychic way it completes my code, so I’m not totally green. What I wanted to do was see how AI fared with a more prompt-driven approach.

Day one

I was working on some blog posts, ironically espousing brevity and simplicity in code, but they were running way too long. I was wrapping up a section on refactoring, suggesting you let ChatGPT refactor your code, when the penny dropped – get it to refactor my prose. I fed it my blog posts, brimming with elaborate, rambling, Proustian prose – eschewing punctuation in favour of cascades of adjectives and metaphorical flights of fancy – and asked it to cut them to about 2,500 words. Done. Less Proust, more Elmore Leonard. Terse. No adjectives.

Day two

I wanted to build an API. I had a seed project I’d built with the most bare-bones MongoDB/Express API. I’m more of a front-end developer, so I wanted it to guide me through something I’m less familiar with.

I already had a really basic ‘products’ API – not the full Create, Read, Update, Delete (CRUD) functionality, but the ability to add a new product and read the list of results, so the CR bit of CRUD. I wanted to do the whole process via prompts, so I started writing prompts for GitHub Copilot as comments in my TypeScript files. The weirdest part is that it actually pre-empted the prompts I was about to write, and got them fairly accurate – asking the questions it was then going to answer on my behalf. I had the ‘delete’ function written fairly quickly, but it took a few runs at it and a bit of manual tweaking from me – it needed to send a ‘post’ to the API but at the first attempt it added a ‘delete’, which made logical sense but didn’t actually work.

Then I asked it to paginate the list of products I was returning, a common feature of APIs, so I tried with a fairly rudimentary prompt:

Write a controller function to get a paginate list of products

It wrote the same ‘return every item’ function I already had, but it did name it ‘getPaginatedProducts’. So I got more specific. I asked ChatGPT to give me an example of a paginated API response and used it almost verbatim, just tweaking the data to fit my ‘Products’ format:

/*
  Write a controller function to get a paginated list of products
  Return the list as JSON with the following structure:
  {
    "page": 1,
    "total_pages": 3,
    "per_page": 5,
    "total_items": 12,
    "data": [
      {"_id": 1, "name": "Item 1", "cost": 10},
      {"_id": 2, "name": "Item 2", "cost": 10},
      {"_id": 3, "name": "Item 3", "cost": 10},
      {"_id": 4, "name": "Item 4", "cost": 10},
      {"_id": 5, "name": "Item 5", "cost": 10}
    ]
  }
*/

This time it wrote code that worked. The problem was that its method of getting “total_items” was to fetch every item in the database and take the length of the array, defeating the point of paginating in the first place. So I asked it to use the ‘countDocuments()’ method of Mongoose – the MongoDB library I was using in my API. It still wrote the same code, so I added another prompt.

/*
Write a controller function to return the total number of products in the database using the "countDocuments" method
*/

With this intermediate step I was then able to get it to write a more performant paginated endpoint.
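The endpoint it eventually produced was along these lines – a sketch of the shape rather than the verbatim output, assuming an Express handler and a Mongoose ‘Product’ model (the import path is hypothetical):

import { Request, Response } from "express";
import { Product } from "../models/product"; // hypothetical path to the Mongoose model

export const getPaginatedProducts = async (req: Request, res: Response) => {
    const page = Math.max(parseInt(req.query.page as string, 10) || 1, 1);
    const perPage = Math.max(parseInt(req.query.per_page as string, 10) || 5, 1);

    // countDocuments gets the total without fetching every record
    const totalItems = await Product.countDocuments();
    const data = await Product.find()
        .skip((page - 1) * perPage)
        .limit(perPage);

    res.json({
        page,
        total_pages: Math.ceil(totalItems / perPage),
        per_page: perPage,
        total_items: totalItems,
        data,
    });
};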

Day three: AI API

Once I had the full CRUD functionality working and tested in Postman, I moved on to building a new endpoint from scratch. I wanted a bit more complexity, so I decided on a locations API with a bit of GeoJSON in there.

Naturally, I went to ChatGPT. I gave it the following prompt:

I wish to create an API. It is to contain a list of locations with geojson data describing the location, a title, some body-copy, an image URL, a category, an array of URLs. I would like to search the data on location or proximity to a location, category title and body-copy.

I would like the result to be paginated so I can request a specific page and number of items per page. I would like the response to a search return a JSON Object showing the total number of entries, page-number, items per page, and an array of results.

Would you give examples of the API calls and JSON responses for a typical Create, Read, Update, Delete (CRUD) API?

It gave me all the GET/PUT/POST/PATCH/DELETE calls and responses, and it all looked good. So I asked it to write me the YAML file for the OpenAPI spec. In less time than I would have spent fixing my bad indentation in YAML, I had it. It was pretty good from the start, but I asked for a couple of rounds of refinements.

With the YAML file produced, I asked for some free, online resources to view and edit the file. It gave me a few options. Having busted past my three freebie APIs on Swagger, I went for Apicurio – a great suggestion I’d never heard of.

I started building the Models and Controllers for the new Locations API by pasting the OpenAPI YAML into a comment as a reference. Again, GitHub Copilot started pre-empting what I wanted to do – listing the files and tests it thought needed writing.

It took a fraction of the time it would have without AI. It was smart enough to use the request body, not the request query, since the payload contained nested objects. I asked for sample JSON to pass to the request body in Postman. It gave me the Golden Gate Bridge, including a bit of copy about it and an image URL.
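For reference, the Location model came out something like this – a from-memory sketch rather than the exact generated code, assuming Mongoose and a GeoJSON Point (the 2dsphere index is what makes the proximity searches possible):

import { Schema, model } from "mongoose";

// Hypothetical sketch of the Location model described in the prompt
const locationSchema = new Schema({
    title: { type: String, required: true },
    body: String,
    imageUrl: String,
    category: String,
    urls: [String],
    // GeoJSON Point so MongoDB can run proximity ($near) queries against it
    location: {
        type: { type: String, enum: ["Point"], required: true },
        coordinates: { type: [Number], required: true }, // [longitude, latitude]
    },
});

locationSchema.index({ location: "2dsphere" });

export const Location = model("Location", locationSchema);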

Day four: Tonite we’re gonna party like it’s 2021!

Today I hit the September 2021 ceiling. I was trying to set up tests for my API, having, of course, asked ChatGPT how I should go about it. It suggested I use the NPM package ‘mongodb-memory-server’. I added the package and Copilot wrote me a test. A Jest test. But I was using Vitest for unit testing, so I asked it to rewrite it in Vitest syntax. “In what syntax?” it replied (I’m paraphrasing a bit). Vitest is under two years old and, as I discovered, ChatGPT’s knowledge hits a cliff edge precisely two years ago.

I bit the bullet and swapped to Jest – this was going to hit all my tests. So I asked how to configure mongodb-memory-server. It gave me instructions, which I followed. I ran the test it wrote for me and it failed due to bad syntax. I had installed the latest package, mongodb-memory-server@8.15.1 – ChatGPT was still looking forward to the premiere of Squid Game and giving me test syntax for version 6, two sets of breaking changes ago. I started trying to fix the issues and hit Google and Stack Overflow, but quickly stopped myself. The whole point was to test the limits and drive this with AI. So I took a different approach…

Day five: CodeWhisperer

I’d read about and watched some videos on AWS CodeWhisperer – like GitHub Copilot, but free. So I installed it. It took a bit more setup than just paying your £5 to GitHub, and the guides were slightly out of date. I turned off Copilot and started testing it with a couple of generic functions. I gave it the following prompt:

/* Write a function to take a Object and recursively search for a key in the object.
If the key is found return the value, else return undefined. 
write the function in TypeScript
*/

It gave me:

export const searchKey = (obj: any, key: string): any | undefined => {
    for (let i in obj) {
        if (i === key) {
            return obj[i]
        } else if (typeof obj[i] === "object") {
            return searchKey(obj[i], key)
        }
    }
}

That looked like it should work. So I got it to write a test, leaving it to its own devices, just accepting the suggestions. I got the following failing test:

describe("searchKey", () => {
    it("should return the correct key", () => {
    const obj = {
      a: {
        b: {
          c: "d",
        },
      },
    };
    expect(searchKey(obj, "d")).toBe("a.b.c");
  });

  it("should return undefined if the key is not found ", () => {
    const obj = {
      a: {
        b: {
          c: "d",
        },
      },
    };
    expect(searchKey(obj, "e")).toBe(undefined);
})
});

The function was fine – it wrote the test wrong; it should have asserted:

expect(searchKey(obj, "c")).toBe("d");
/* NOT */
expect(searchKey(obj, "d")).toBe("a.b.c");

I gave it another prompt, fairly vague and open to interpretation – I was less concerned with the actual functionality; it was just a second chance for CodeWhisperer to write and test its own code.

This was the prompt:

/*
Write a function called flattenObject which accepts an object and returns a new object with all values flattened.
*/

I hadn’t specified recursive this time, but it gave me a recursive function – the only bit I contributed was pulling the ‘GenericObject’ type in.
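For context, ‘GenericObject’ is nothing fancy – just a loose index-signature type, along these lines (my assumption of the shape, not the exact type from my codebase):

// Assumed shape of the GenericObject type used below
type GenericObject = { [key: string]: any };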

It did this:

export const flattenObject = (obj: GenericObject): GenericObject => {
    let result: GenericObject = {};
    for (let i in obj) {
        if (typeof obj[i] === "object") {
            let temp = flattenObject(obj[i]);
            for (let j in temp) {
                result[i + "." + j] = temp[j];
            }
        } else {
            result[i] = obj[i];
        }
    }
    return result;
}

It looked like it should work so, again, I had CodeWhisperer write a test. This time it wrote one good and one dud test.

describe("flattenObject", () => {
    it("should flatten object", () => {
    const obj = {
      a: {
        b: {
          c: 1,
        },
      },
    };

    expect(flattenObject(obj)).toEqual({ "a.b.c": 1 });
  });

  it("should flatten object with array", () => {
    const obj = {
      a: {
        b: {
          c: [1, 2, 3],
        },
      },
    };
    expect(flattenObject(obj)).toEqual({ "a.b.c": [1, 2, 3] });  
  });
});

The first test was fine, but the second tripped over the fact that an Array is an Object in JavaScript: the function created “0”, “1” and “2” entries rather than keeping the array, and gave the following error:

flattenObject › should flatten object with array

    expect(received).toEqual(expected) // deep equality

    - Expected  - 5
    + Received  + 3

      Object {
    -   "a.b.c": Array [
    -     1,
    -     2,
    -     3,
    -   ],
    +   "a.b.c.0": 1,
    +   "a.b.c.1": 2,
    +   "a.b.c.2": 3,
    }

I gave it a further prompt:

// An Array will be treated as a single object with key

It had a go but, ultimately, needed helping along.
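The fix we arrived at between us was, roughly, to treat arrays (and null) as leaf values rather than recursing into them – a sketch of the idea rather than the exact final code:

export const flattenObject = (obj: GenericObject): GenericObject => {
    let result: GenericObject = {};
    for (let i in obj) {
        // Arrays (and null) are technically Objects in JavaScript, so check for them
        // explicitly and keep them as leaf values instead of recursing into them
        if (typeof obj[i] === "object" && obj[i] !== null && !Array.isArray(obj[i])) {
            let temp = flattenObject(obj[i]);
            for (let j in temp) {
                result[i + "." + j] = temp[j];
            }
        } else {
            result[i] = obj[i];
        }
    }
    return result;
}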

Out of curiosity I swapped over to Copilot to write the tests – it wrote them correctly and used more real-world mocks, see below:

const obj = {
    name: "John",
    age: 30,
    address: {
        city: "New York",
        state: "NY"
    },
    hobbies: ["reading", "writing", "coding"]
}

I have heard good reports about CodeWhisperer, and it is more attuned to AWS-related tasks, naturally. So far, on a very, very shallow dive, Copilot is winning. I had no luck getting any guidance on mongodb-memory-server.

I searched for ChatGPT alternatives and tried Google Bard. It fell at the first hurdle when I asked it how to set up mongodb-memory-server: it told me it didn’t know.

So I tried Bing. This was interesting. I used the prompt:

I would like to write tests for an API, written in Typescript, connecting to MongoDB via Mongoose. I would like to use mongodb-memory-server, version 8.15.1, from this GitHub repo, https://github.com/nodkz/mongodb-memory-server, to run unit tests in Jest. Would you show me example code for the unit tests?

It started writing what looked like a decent response, except it was for Jasmine, not Jest. Then it stopped, the code disappeared, and it gave me this response:

“My mistake, I can’t give a response to that right now. Let’s try a different topic.”

I told it it was on the right track and asked it to rerun the query. It did, but again the code was for version 6.

Next I tried Claude AI – this time I got a usable result. The demo code it gave me actually failed, but not because it didn’t run; it just had too many items in the database – something I can at least deal with.

Claude was up to date – it wrote me usable tests and explained its rationale, clearly and concisely. When I had issues, like the database persisting between tests rather than clearing, I explained the problem and it gave me working solutions.
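The setup it talked me through looked roughly like the sketch below – my reconstruction rather than Claude’s verbatim output, assuming Jest, Mongoose and a ‘Product’ model, with every collection emptied between tests:

import { MongoMemoryServer } from "mongodb-memory-server";
import mongoose from "mongoose";
import { Product } from "../models/product"; // hypothetical model under test

let mongod: MongoMemoryServer;

beforeAll(async () => {
    // Spin up an in-memory MongoDB instance and point Mongoose at it
    mongod = await MongoMemoryServer.create();
    await mongoose.connect(mongod.getUri());
});

afterEach(async () => {
    // Empty every collection so state doesn't leak between tests
    for (const collection of Object.values(mongoose.connection.collections)) {
        await collection.deleteMany({});
    }
});

afterAll(async () => {
    await mongoose.disconnect();
    await mongod.stop();
});

describe("products API", () => {
    it("saves and retrieves a product", async () => {
        await Product.create({ name: "Item 1", cost: 10 });
        const products = await Product.find();
        expect(products).toHaveLength(1);
    });
});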

As a caveat, a colleague had a less favourable experience with Claude – it told him it couldn’t write code. Maybe I asked more politely?

Day six: Taking back the reins

I became more interested in getting results than in being a prompt purist, and started taking more of a lead in the coding, letting Copilot be more of, well, a copilot.

Really, this is how AI works best at present – as your sous chef when cooking up your code, Robin to your Batman. It excels at writing what would otherwise be repetitive unit tests and the donkey-work code that requires no ingenuity, but, right now, it isn’t going to replace the developer.