Thoughts on the AI train
References: https://en.wikipedia.org/wiki/Gartner_hype_cycle
Definitions:
- AI: the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
- Language model: a computational model that predicts the probability of a word or sequence of words based on the preceding words (see the toy sketch after these definitions).
- Intelligence: the ability to acquire and apply knowledge and skills.
- Knowledge: facts, information, and skills acquired by a person through experience or education; the theoretical or practical understanding of a subject.
- AGI: a hypothetical type of artificial intelligence (AI) that aims to mimic the cognitive abilities of the human brain.
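
To make that language-model definition concrete, here is a toy sketch in Python: a bigram model that estimates the probability of the next word purely from counts of what followed it before. The tiny corpus is made up for illustration; real LLMs do this with neural networks over tokens and vastly more data, but the job, predicting what comes next, is the same.

```python
# Toy sketch of the language-model definition above: estimate the
# probability of the next word from counts of what followed it before.
# The corpus is made up for illustration.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word (a bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def next_word_probs(prev):
    """P(next word | preceding word), from raw counts."""
    counts = following[prev]
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}

print(next_word_probs("the"))
# {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```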
Ever since OpenAI released ChatGPT (running GPT-3.5) back in November of 2022, AI has been on everyone's mind. We talk about how it changes work, how it affects content creators, and how it affects our lives overall. So I thought I would jump on this train and give my thoughts on AI and what I think might happen. This might be a collection of what you have already heard from other folks, but I wouldn't be human if my thoughts weren't shaped by the opinions and things I read.
First, I think the definition of AI has changed. Per the definition above, AI used to encompass a wide variety of tasks. ChatGPT and other models are only a subset of that: word prediction (and they are really good at it, too). I think this causes issues in how they are perceived. ChatGPT and the others are narrowly focused AI that is extremely good at predicting the next word in a sequence. Language models, to me, are more artificial knowledge than intelligence. Knowledge, because they were trained on enormous amounts of data, with billions if not trillions of parameters, and through this training (plus the experience of interacting with users as additional training) they offer a way to display facts and information based on context. That is a form of some intelligence, but mechanically they are only trying to predict the next word in the sequence based on your question. I think this is highlighted in math and computer programming. I used ChatGPT a lot for scripting and light programming of small tools. It's great at the small things, but anything bigger it does not do well. This is not for lack of knowledge. Most of these models are trained on lots of code, but programming is more than just putting words next to each other; it's about solving a problem, and language models are only about solving the next word. To me that is not the same. I think this is why you now hear so much about AGI (artificial general intelligence). AGI is really what we are looking for, what we see in movies, where the machine can have knowledge but also apply that knowledge and solve a problem on its own. Not predicting the next word, but predicting the future solution of a problem. Meaning it actually develops a new step to get to an answer, which to me is different from predicting the next word.
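
To make the "only solving the next word" point concrete, here is a minimal sketch of the generation loop every language model runs. Everything in it is made up for illustration: `next_token_logits` is a hypothetical stand-in for the trained network, and the tiny vocabulary and hard-coded scores exist only so the sketch runs on its own.

```python
import math

VOCAB = ["2", "4", "=", "+", "<end>"]

def next_token_logits(context):
    # Hypothetical stand-in for the trained neural network: in a real
    # model these scores come from billions of learned parameters.
    # Hard-coded here so the sketch is self-contained.
    scores = {"2 + 2 =": [0.1, 3.0, 0.0, 0.0, 0.2],
              "2 + 2 = 4": [0.0, 0.0, 0.0, 0.0, 3.0]}
    return scores.get(context, [0.0] * len(VOCAB))

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# The whole "reasoning" loop: pick the most probable next token, append
# it, and repeat. There is no plan and no goal, just one word at a time.
context = "2 + 2 ="
while True:
    probs = softmax(next_token_logits(context))
    token = VOCAB[max(range(len(VOCAB)), key=probs.__getitem__)]
    if token == "<end>":
        break
    context += " " + token
print(context)  # 2 + 2 = 4
```

The model gets the right answer here, but only because "4" was the likeliest continuation, not because it worked the problem. That gap is exactly what AGI talk is about.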
Second, I think we are way past the hype stage and close to real productivity with LLMs in general use. Most people have seen the chart called the Gartner hype cycle (linked above). There is a technology trigger, then a peak of inflated expectations, then the hype wanes into a trough of disillusionment, and then people find an actual use and it climbs into a stage where it genuinely helps and gets deployed more and more. Most people would place LLMs somewhere on the downslope from peak hype or in the disillusionment stage. I actually think we are past most of that and are either on the slope of enlightenment or already on the plateau of productivity. I think the peak of hype and the fall really happened between GPT-3.5 and GPT-4 from OpenAI. Once GPT-4 arrived, around March of 2023, most companies got close to it or on the same page. That year, I think most people were hitting walls with it. It can do code really well but not the greatest, and its word predictions started getting weird as conversations grew long. Meaning that if you talk to it long enough, it will simply extract verbatim text of what it was trained on. This is how it got in trouble with several news publishers. We started to hit walls on the base models themselves. This eventually leads to the wrapper over the candy.
Wrappers are amazing. They protect most of our goods so that once we unwrap them, we get to taste the delicious goodness inside. Sadly, in the language model space, the inside is dull and boring, and it leaves a bad taste when we want it to do a thing and it doesn't. However, just like most products sold at department stores, if you wrap it up nicely, a piece of garbage can be worth millions. Seriously, wrappers were the saving grace of language models. A wrapper allows additional context awareness for the task the language model is asked to do. We can give it documents for extra context even if it was not trained on that subset of data. It can search the web, and it can do light coding to display graphs and do calculations. This is why I believe we are in the enlightenment/productivity stage. Most companies offer some type of AI agent, in the form of a travel planner, analysis in the context of your needs, or a co-pilot for programmers that can auto-fill code based on what is being typed. Are they the greatest? No. But they are helping the average person be more productive. There are downsides to these wrappers, of course, but they change and get adjusted over time. Overall, the wrapper makes it good, and not all wrappers are the same. Some wrappers are good at what they do, and that becomes the tool. Some wrappers, you want to take a hammer to. Nevertheless, this makes LLMs less of a hype and lets them finally add value to our work or lives.
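
As a rough sketch of what such a wrapper does under the hood: it finds relevant snippets and pastes them into the prompt before the model ever sees the question. Everything here is illustrative and hypothetical; `call_llm` is a placeholder for whatever model API the wrapper actually uses, and the word-overlap scoring is a crude stand-in for the embedding search real products use.

```python
# Minimal sketch of a document-aware wrapper: find the most relevant
# snippets, paste them into the prompt, then ask the model. All names
# here (call_llm, retrieve) are hypothetical placeholders.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model API call.
    raise NotImplementedError("plug in your model API here")

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    # Crude relevance score: count shared words. Real wrappers use
    # embeddings and a vector index, but the idea is the same.
    q_words = set(question.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def ask_with_context(question: str, documents: list[str]) -> str:
    snippets = "\n".join(retrieve(question, documents))
    prompt = (f"Answer using only the context below.\n"
              f"Context:\n{snippets}\n\n"
              f"Question: {question}")
    return call_llm(prompt)
```

The model was never trained on those documents; the wrapper just makes them part of the question. That is why the same base model suddenly feels so much more capable once it is wrapped.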
Let's close this out with the following. LLMs are a subset of AI; they are a different type of AI. If people say this is another hype cycle in AI, I don't really know what they mean. AI is an umbrella term, and AI has been used for a long time, in other ways and in different forms, since before LLMs were a thing. LLMs are good, but they require a wrapper to make them an actual tool that can be used (in some cases just the chat is fine). The hype of LLMs, to me, is done; in my opinion, we now move on to getting good, productive tools out of them.