Zero to Hero in AI
Begin your journey into the world of AI, machine learning, deep learning, and generative AI
Lately, when talking to my friends, colleagues, and well-wishers, I have often been asked how to start a journey into the world of AI. There are so many terms to understand and fundamentals to learn, and while plenty of information is available, it is rarely in the right order. This article is an attempt to help anyone wanting to start their journey into AI. With AI blending into our daily reality, it is worth knowing the history and the basics of how it all comes together. Millions of users have signed up for ChatGPT (if you have not heard of it or used it, please head to ChatGPT and start interacting) or are using it for free, and many have wondered how it all came together and what it would take to create such AI models and tools.
I have tried to put together an overview of the different kinds of contributors to the AI landscape, as shown below, and what needs to be learned based on which category you fall into. At the very least, I suggest you learn "prompt engineering." This will help you utilize LLMs like ChatGPT to the fullest extent possible.
Let's look at how one can play their part in the world of AI. I have divided the responsibilities and usage into four categories, as shown below. You can pick whichever category you belong to.
Figure 1
Below is a list of the courses to take for each category, in that same order. Pick your category and start your learning journey. It will take a good amount of time to complete, so set aside some time each day to help you reach the goal. You can reach out to me at any time via LinkedIn, Twitter, or here on Substack.
Keep one thing in mind, though: this categorization describes the ecosystem of machine learning model development and usage at a very high level. In reality, it is more nuanced, with overlaps between roles, and the landscape is rapidly evolving. I have tried to give you enough to get started.
Also, if you do not understand some of the terms in the picture above, do not worry. Once you finish the courses outlined in this article, you will be equipped with all the relevant information, which will help you both personally and professionally. To kick-start, I will define the basic terms below.
Differences between machine learning, data science, deep learning, and generative AI:
Machine learning mainly deals with data analysis and prediction, segmentation or classification, clustering, and many related algorithms. For example, sentiment analysis is a machine learning task in which a model reads movie reviews and classifies them as "Neutral," "Positive," or "Negative". There is no interaction from the user in this aspect. Amazon's and Netflix's recommendation algorithms fall into this category. This is sometimes called "classical" machine learning, which includes techniques like decision trees, logistic regression, and support vector machines.
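To make the sentiment-analysis example concrete, here is a minimal sketch of a classical machine learning classifier: a tiny logistic regression trained with gradient descent on a toy set of labeled reviews. The reviews, vocabulary, and learning rate are all made up for illustration; a real system would use far more data and a library like scikit-learn.

```python
import math

# Toy movie reviews labeled 1 (positive) or 0 (negative) -- illustrative data only.
reviews = [
    ("great movie loved it", 1),
    ("wonderful acting great plot", 1),
    ("terrible movie hated it", 0),
    ("boring plot terrible acting", 0),
]

# Build a vocabulary and represent each review as bag-of-words counts.
vocab = sorted({w for text, _ in reviews for w in text.split()})

def vectorize(text):
    words = text.split()
    return [words.count(w) for w in vocab]

X = [vectorize(text) for text, _ in reviews]
y = [label for _, label in reviews]

# Logistic regression trained with plain stochastic gradient descent.
weights = [0.0] * len(vocab)
bias = 0.0
lr = 0.5
for _ in range(200):
    for xi, yi in zip(X, y):
        z = sum(w * x for w, x in zip(weights, xi)) + bias
        p = 1 / (1 + math.exp(-z))          # sigmoid: probability of "Positive"
        err = p - yi
        weights = [w - lr * err * x for w, x in zip(weights, xi)]
        bias -= lr * err

def predict(text):
    z = sum(w * x for w, x in zip(weights, vectorize(text))) + bias
    return "Positive" if 1 / (1 + math.exp(-z)) > 0.5 else "Negative"

print(predict("great plot"))
print(predict("terrible boring"))
```

The model simply learns which words push a review toward "Positive" or "Negative"; that pattern of learning parameters from labeled data is the essence of classical supervised machine learning.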
I kept referring to data science when I mentioned machine learning. You need to keep in mind that not all data scientists are machine learning experts. Data science is a broad field that encompasses various disciplines, including statistics, data analysis, and data visualization, along with machine learning. While some data scientists specialize in machine learning, others may focus more on areas like data cleaning, statistical analysis, or the creation of informative data visualizations. Machine learning is an important aspect of data science, but it's just one of many tools in a data scientist's skill set. This highlights the nuance I mentioned earlier, where there is an overlap of roles.
Large language models like ChatGPT are deeply rooted in deep learning, a specialized branch of machine learning based on "neural networks": simple functions called neurons (by analogy with the brain) interconnected in layers to process and learn from data. These networks can learn complex patterns from large amounts of data, making them effective for tasks like language modeling.
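To see what "neurons interconnected in layers" means in code, here is a minimal forward pass through a two-layer network. All weight and bias values are arbitrary placeholders chosen for illustration; in practice they would be learned from data.

```python
import math

def neuron(inputs, weights, bias):
    # One "neuron": a weighted sum of its inputs plus a bias,
    # squashed through a sigmoid activation into the range (0, 1).
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    # A layer is just several neurons reading the same inputs.
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Tiny network: 2 inputs -> 3 hidden neurons -> 1 output neuron.
x = [0.5, -1.0]
hidden = layer(x, [[0.1, 0.8], [0.4, -0.2], [-0.3, 0.5]], [0.0, 0.1, -0.1])
output = layer(hidden, [[0.6, -0.4, 0.9]], [0.2])
print(output)
```

Deep networks stack many such layers; "learning" means adjusting all the weights and biases so the final output matches the training data.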
Now, let's put this in a visual form so that it sticks.
Figure 2
There is a nuance you should note regarding GenAI. In the picture, I showed GenAI overlapping with machine learning, neural networks, and deep learning. That is true in most cases. However, some examples of GenAI do not use neural networks at all. One is the Markov chain model (which generates sequences of text or music based on the probability of each item following the preceding one); another is the Bayesian network, used in medical diagnosis. We can revisit these in future articles. Keep this information handy.
Armed with a high-level overview of who does what and how the different layers of AI are interconnected, let's jump into the learning pathways. You can pick and choose whichever pathway you want to take. However, the minimum requirement, as I suggested, is prompt engineering, which helps with generative AI tool usage. Let's start with that. If you are satisfied with that alone and just want to be the user, you can skip the rest of the learning modules. Otherwise, follow through.
Generative AI:
If you just want a very high-level overview without delving into details, do the following: start with Andrew Ng's 1-hour GenAI course before moving on to Google's 8-hour course. At a minimum, take the first one and skip ahead to the "How the LLMs Work" part.
Andrej Karpathy's course on Intro to LLMs: This requires some prerequisite knowledge of what parameters, weights, and biases mean. He covers them to some extent, but knowing about them beforehand is helpful. You will also hear references to neural networks, which he covers at the beginning of the video. Here is a quick introduction to those terms:
Parameters:
In machine learning, parameters are the parts of the model that are learned from the training data. Think of them as the dials or knobs a model can adjust to make better predictions. For example, if you were trying to predict the price of a house based on its size, the parameter might be a factor that you multiply the size by to estimate the price.
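The house-price example above can be sketched directly: the model has a single parameter (the factor you multiply size by), and gradient descent adjusts that "dial" until predictions match the training data. The sizes and prices are invented toy numbers.

```python
# Toy data: house size (in 1000 sqft) -> price (in $100k).
# The true relationship here is price = 1.5 * size.
sizes = [1.0, 2.0, 3.0]
prices = [1.5, 3.0, 4.5]

# The single "parameter" the model learns: a multiplier on size.
factor = 0.0
lr = 0.1  # learning rate: how much to turn the dial each step
for _ in range(100):
    for size, price in zip(sizes, prices):
        pred = factor * size
        # Nudge the parameter to reduce the prediction error.
        factor -= lr * (pred - price) * size

print(round(factor, 2))  # converges to 1.5
```

Real models have millions or billions of such parameters, but each one is adjusted by essentially this same error-driven nudging.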
Weights:
Weights are a type of parameter. In the context of a neural network, weights determine the strength of the connection between two neurons. Imagine you're deciding whether to wear a jacket when you go outside. You might consider two factors: whether it's cold (a bit important) and whether it's raining (very important). The "weight" you give to the cold would be lower than the weight you give to the rain, so the network pays more attention to the rain when "predicting" whether you should wear a jacket.
Biases:
A bias is another type of parameter that is added to the weighted sum before applying the activation function in a neural network. The bias allows the model to shift the activation function to the left or right, which can be crucial for fitting the model to the data. Using the previous example, let's say you're someone who gets cold easily. Even if it's not very cold outside, you might have a personal bias towards wearing a jacket. In a neural network, a bias would allow the network to be more likely to predict wearing a jacket, even if the weighted sum of the inputs (like the temperature and rain) isn't very high.
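The jacket example with both weights and a bias can be written as a single artificial neuron. The weight values below are arbitrary choices that merely encode "rain matters more than cold"; the bias encodes how easily a person gets cold.

```python
import math

def jacket_probability(cold, raining, bias):
    # Weights encode importance: rain (2.0) matters more than cold (0.5).
    w_cold, w_rain = 0.5, 2.0
    z = w_cold * cold + w_rain * raining + bias  # weighted sum plus bias
    return 1 / (1 + math.exp(-z))                # sigmoid activation

# A mild day (cold=0.2, no rain) with a neutral person (bias=-1.0):
print(round(jacket_probability(cold=0.2, raining=0.0, bias=-1.0), 2))

# Same mild day, but someone who gets cold easily (bias=+1.0)
# is shifted toward wearing a jacket even with identical inputs:
print(round(jacket_probability(cold=0.2, raining=0.0, bias=1.0), 2))
```

Notice that the inputs are identical in both calls; only the bias changes, shifting the output across the 0.5 decision threshold, which is exactly the "shift the activation function" role described above.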
Let's build GPT (Andrej Karpathy's YT video): This explains Transformers, the architecture behind ChatGPT-like systems, and walks through how to build a smaller version of GPT locally.
If you want to explore further on generative AI, you can continue with the following in this same order:
Use the segregation shown in Figure 1 for each section. This pic is courtesy of Shubham Saboo.
Prompt Engineering:
The courses above introduced you to generative AI and prompt engineering. If you want to dive deep into the fundamentals and get your hands dirty, follow the path below in the same order.
First and foremost, for anything you want to do with code in AI, machine learning, or Generative AI (hands-on), you would need to know Python. If you have not learned Python, start here.
Python:
Corey Schafer's YouTube channel (refer as needed)
Once you have finished Python, jump on to the courses listed below. You can skip the math for the ML part and start with machine learning; however, if you want to learn more, you can refer to the links given below for mathematical aspects.
If you are already familiar with or want to jump in on the deep learning or AI aspects, please refer to those sections as you like.
Math for ML:
Machine Learning:
Foundational courses:
Machine Learning Crash Course with TensorFlow APIs, a Google foundational course
You can see how a machine learning model is created and used in the graphic provided by “The Ravit show” as shown below.
Deep Learning:
The little book of deep learning, a book
Get Deeper into deep learning, a book on GitHub
Data science:
Artificial Intelligence:
This starts off with neural networks, deep learning, computer vision, and finally NLP techniques.
The list of courses highlighted here is by no means comprehensive. However, taking them will set you on the right path to becoming well-versed in AI terminology and inner workings. Even if you do not take them all, you can still gain a minimum foundation to build on by following the latest on Twitter, Reddit, and elsewhere, where many people share their wealth of knowledge. All you have to do is follow them. If you like this, please subscribe and forward it to others, and leave feedback on any specific topic you want me to cover.