Date: Tue Feb 11 00:16:11 CET 2025
Unless you’ve been living under a rock, LLMs - large language models, basically super-duper complex AI magic that predicts the next word or token - have taken the world by storm. They have come a long way. They’re not perfect. They don’t think. They don’t reason. They aren’t artificially intelligent…
BUT …
I have found them quite useful.
ChatGPT 4o is pretty good. Work has an enterprise partnership so we can use it internally, and it’s been a game changer for learning things like Terraform - a way to define infrastructure (databases, virtual machines, etc.) in code, though not nearly as vendor-agnostic as something like Kubernetes.
The thing is, it takes gobs of money to train these models. OpenAI, the folks behind ChatGPT, are spending billions, and it briefly made Nvidia the most valuable company in the world; it’s in the top 5 now.
But then at the beginning of the year a Chinese hedge fund put out a model called DeepSeek R1 with some optimizations that reportedly, allegedly, let it be trained for ~$6M USD. Whether true or not, it spooked the markets. They’ve since recovered, but still.
The problem with LLMs is that the interval between training and release is so long that by the time they come out, if they can’t be fed new data or consume new things, they’re already out of date - almost obsolete. It would be better if they could ingest, triage, and store only high-quality, high-signal (low-noise) data and stay constantly up to date.
I had to learn Terraform, and I had to learn it fast. I was familiar with it, sure, but I had never done anything with it end-to-end.
So I turned to ChatGPT and had it try to solve something for me. Then I would run the code, see what happened, and learn from the mistakes the LLM made and the errors.
That was the loop:
Prompt LLM -> run LLM generated code -> reprompt given errors -> run LLM generated code …
Now mind you, I had spent the large part of my teenage years not growing as a human, making connections, learning skills, etc. Nope - I spent it a lot like Mr. Anderson in his apartment, surrounded by computers.
So I knew a bit of how things should look in an “old school” sort of way:
I needed an HTTP proxy to do SSL termination, and I needed it to forward requests to a server on a port, and so forth. But how to navigate the arcane world of AWS - that was the challenge.
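That kind of setup can be sketched in Terraform with an AWS application load balancer terminating TLS and forwarding to a target group. This is a minimal, hedged sketch, not what I actually shipped - the variable names, ports, and the surrounding VPC, subnets, and certificate are all assumed placeholders:

```hcl
# Application load balancer acting as the HTTPS entry point.
resource "aws_lb" "proxy" {
  name               = "example-proxy"
  load_balancer_type = "application"
  subnets            = var.public_subnet_ids
}

# The backend server(s) listening on a plain-HTTP port.
resource "aws_lb_target_group" "app" {
  name     = "example-app"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = var.vpc_id
}

# TLS terminates at this listener; traffic is forwarded on as HTTP.
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.proxy.arn
  port              = 443
  protocol          = "HTTPS"
  certificate_arn   = var.certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.app.arn
  }
}
```

The point is how little of this is obvious from first principles: the "proxy" is split across three resources, and knowing that shape in advance is exactly what I was leaning on the LLM for.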
So I prompted the LLM and sanity checked it against what I thought it needed and went to town.
And lo and behold it worked!
After a lot of tweaking, reading documentation, and fixing things the LLM hallucinated, I got my task done!
Now imagine if that middle part - double-checking the LLM - could have been a tighter loop, where it didn’t need me second-guessing most of it to make sure things weren’t way outside the norms, e.g. an S3 bucket set to public or something silly.
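For that specific example, the guardrail even exists as a Terraform resource already. A minimal sketch, with the bucket name as a placeholder - AWS’s public access block refuses public ACLs and public policies on a bucket outright:

```hcl
resource "aws_s3_bucket" "data" {
  bucket = "example-private-data"
}

# Belt-and-suspenders: reject public ACLs and policies on this bucket.
resource "aws_s3_bucket_public_access_block" "data" {
  bucket                  = aws_s3_bucket.data.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```

A tighter loop would be one where the model emits this kind of thing by default, instead of me having to remember to check for it.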
If the model were trained only on Terraform data - not blog posts and obviously wrong StackOverflow posts that might lead it to hallucinate a property or method that doesn’t exist - and instead the entire corpus were the documentation, plus curated data fed in by the vendor itself (HashiCorp, in Terraform’s case), I would have had an LLM tailored specifically to this task, and things would have been even quicker.
I think time will only make LLMs more efficient and cheaper to run. If they get cheap enough and can be made more real-time, we’ll have many bespoke LLMs with very few hallucination issues.
Copyright Alex Narayan - 2024