Artificial Intelligence: From Model to Mind


The Toddler Years

Scientists and engineers were like patient parents, teaching computers simple AI tricks. Progress was incremental, and many people doubted that AI could ever become truly useful on a large scale. 

Deep learning, a subset of machine learning, had existed for decades but had limited success due to insufficient computational power and data. However, around 2012, advancements in hardware (particularly GPUs), the availability of large datasets, and novel algorithms led to a renaissance in AI. 

Researchers discovered that deep learning techniques could significantly improve performance across various tasks, such as image recognition and natural language processing. It was like discovering a new way to teach our toddler AI, enabling it to learn faster and better. That said, you still needed a different model for each modality: one for text, another for images, and another for video.

Between 2012 and 2018, scientists around the world worked tirelessly, driven by their curiosity and passion. They wrote research papers and collaborated, much like a group of wizards sharing their secret spells. In 2017, a particularly powerful spell was developed at Google Brain: the Transformer, a model that weighs the relevance of each word or phrase it encounters based on the surrounding context, not just direct cues. The mechanism behind this, called self-attention, led to big breakthroughs. It also meant a single architecture could eventually handle multiple modalities, be it text, images, or video.

The Transformer architecture is the cornerstone of many recent advancements in AI, including ChatGPT. GPT literally stands for Generative Pre-trained Transformer.
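To make the attention idea concrete, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside the Transformer. Treat it as a toy illustration: the shapes, names, and the random "sentence" are made up for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention, as in 'Attention Is All You Need' (2017).

    Q, K, V: arrays of shape (sequence_length, d_k).
    Each output row is a context-weighted blend of the value vectors.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # how relevant each token is to every other token
    weights = softmax(scores, axis=-1)  # turn scores into a probability distribution
    return weights @ V                  # mix the values according to relevance

# Toy example: a "sentence" of 4 tokens, each embedded in 8 dimensions.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(tokens, tokens, tokens)  # self-attention
print(out.shape)  # (4, 8): each token now carries context from all the others
```

This is exactly the "relevance based on surrounding context" described above: the attention weights decide how much each word should look at every other word in the sequence.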

Starting with GPT-2 in 2019, each new version of GPT became smarter and more impressive. Developers eagerly adopted these models, witnessing gradual but significant improvements. The true “AHA!” moment, however, arrived in late 2022, when ChatGPT’s capabilities reached a level that could truly wow consumers with its intelligence and usability. That was the moment generative AI stepped out into the world, no longer just an experiment but a part of our collective digital future.

Data Is NOT the New Oil. It’s Much More Valuable

Imagine a young student who thrives on a steady diet of books and learning materials. The more data our AI has, the faster and smarter it can become. 

There are two ways to make our AI models smarter: by making the models bigger, i.e., by scaling up compute, and by giving them lots of good-quality data to learn from.
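Researchers have even fitted formulas to this trade-off between model size and data. Here is a toy sketch of a Chinchilla-style scaling law in Python; the constants are the fits reported by Hoffmann et al. (2022) and should be read as illustrative, not gospel.

```python
def chinchilla_loss(n_params, n_tokens,
                    E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Predicted pre-training loss L(N, D) = E + A/N^alpha + B/D^beta.

    N = model parameters, D = training tokens. The constants are the
    published Chinchilla fits; treat them as illustrative only.
    """
    return E + A / n_params**alpha + B / n_tokens**beta

# Growing either axis lowers the loss, with diminishing returns.
print(chinchilla_loss(70e9, 1.4e12))   # roughly Chinchilla's own budget
print(chinchilla_loss(140e9, 1.4e12))  # bigger model, same data
print(chinchilla_loss(70e9, 2.8e12))   # same model, more data
```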

But you don’t always have data. 

Some problems are like complex puzzles that even the smartest student, or the smartest AI, can’t solve right away. So researchers are teaching AI to learn by experimenting and trying again, much like a child learning through play. This method, called reinforcement learning, helps AI models get smarter without needing every answer handed to them.
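As a tiny illustration of trial-and-error learning, here is an epsilon-greedy bandit in plain Python. It is a sketch, not any real lab’s training setup: the hidden payoff numbers are invented, and the point is simply that the agent improves using only the rewards its own attempts produce.

```python
import random

# Trial-and-error learning in miniature: an epsilon-greedy bandit.
# The agent is never told the right answer; it discovers the best "arm"
# purely from the rewards of its own attempts.
TRUE_PAYOFFS = [0.2, 0.5, 0.8]   # hidden from the agent
estimates = [0.0, 0.0, 0.0]      # the agent's running value estimates
counts = [0, 0, 0]
epsilon = 0.1                    # how often to explore instead of exploit

random.seed(0)
for step in range(5000):
    if random.random() < epsilon:
        arm = random.randrange(3)                        # explore: try something new
    else:
        arm = max(range(3), key=lambda a: estimates[a])  # exploit: best so far
    reward = 1.0 if random.random() < TRUE_PAYOFFS[arm] else 0.0
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]  # incremental mean

print([round(e, 2) for e in estimates])  # roughly approaches [0.2, 0.5, 0.8]
```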

Imagine a thrilling race where the competitors are powerful AI models from different companies. Right now, all the Big Tech players, each with their own strengths and weaknesses, are competing for the podium. For instance, Google Gemini is great at creating stories, while ChatGPT excels at answering questions.

But who will win this race? 

It’s all about resources and talent. The companies with the most compute and the smartest teams will have the edge. It’s like a marathon where having the best shoes and the best coaches makes all the difference.

In this race, having the best “running shoes” means having the best computer chips. 

Companies like NVIDIA make these chips: GPUs that can handle huge numbers of parallel operations and churn through large datasets. Some companies, like Google, have their own TPU chips, which are optimised for the tensor operations at the heart of deep learning models such as the Transformer, the very architecture behind GPT. It’s like having a secret weapon that gives them a competitive advantage.

Apple is like the dark horse in this race. While others are focused on big, powerful models, Apple is working on smaller, smarter models that can run on your phone. Imagine having a tiny genius in your pocket, helping you with everyday tasks. This could change the game, allowing smart AI to be accessible to everyone, everywhere.

The ultimate prize in this AI adventure is reasoning. This is the ability to think and solve problems in a way that goes beyond just following rules. It’s like teaching our AI to not just learn facts but to understand and create new ideas.

To achieve this, scientists are creating environments where AI can experiment and learn, much like a scientist in a lab. They’re teaching AI to not just memorise but to think, reflect, and improve. This is a big challenge, but it opens the door to exciting research and innovative approaches.
