Generative AI Using Python
Module 1
Difference Between Traditional AI and Generative AI
Traditional AI | Generative AI |
---|---|
Traditional AI is used to create intelligent systems that can perform tasks which generally require human intelligence. | Generative AI generates new text, audio, video, or other types of content by learning patterns from existing training data. |
The purpose of AI algorithms or models is to mimic human intelligence across a wide range of applications. | The purpose of generative AI algorithms or models is to generate new data having similar characteristics as the data in the original dataset. |
Overview of Generative Adversarial Network
How does a GAN work?
GANs train by having two networks, the Generator (G) and the Discriminator (D), compete and improve together. Here's the step-by-step process:
1. Generator's First Move
The generator starts with a random noise vector (a set of random numbers). It uses this noise as a starting point to create a fake data sample, such as a generated image. The generator's internal layers transform this noise into something that looks like real data.
2. Discriminator's Turn
The discriminator receives two types of data:
- Real samples from the actual training dataset.
- Fake samples created by the generator.
D's job is to analyze each input and determine whether it's real data or something G cooked up. It outputs a probability score between 0 and 1: a score close to 1 indicates the data is likely real, while a score close to 0 suggests it's fake.
3. Adversarial Learning
- If the discriminator correctly classifies real and fake data it gets better at its job.
- If the generator fools the discriminator by creating realistic fake data, it receives a positive update and the discriminator is penalized for making a wrong decision.
4. Generator's Improvement
- Each time the discriminator mistakes fake data for real, the generator learns from this success.
- Through many iterations, the generator improves and creates more convincing fake samples.
5. Discriminator's Adaptation
- The discriminator also learns continuously by updating itself to better spot fake data.
- This constant back-and-forth makes both networks stronger over time.
6. Training Progression
- As training continues, the generator becomes highly proficient at producing realistic data.
- Eventually the discriminator struggles to distinguish real from fake, which shows that the GAN has reached a well-trained state.
- At this point, the generator can produce high-quality synthetic data that can be used for different applications.
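The adversarial loop described above can be sketched in a few lines of PyTorch. The snippet below is a minimal illustration on one-dimensional toy data; the network sizes, learning rate, batch size, and noise dimension are arbitrary choices made for illustration, not a recommended setup.

import torch
import torch.nn as nn

# Toy setup: real data is 1-D samples from N(4, 1.25); noise is an 8-D standard normal vector
noise_dim, data_dim = 8, 1
G = nn.Sequential(nn.Linear(noise_dim, 16), nn.ReLU(), nn.Linear(16, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 16), nn.LeakyReLU(0.2), nn.Linear(16, 1), nn.Sigmoid())

criterion = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(1000):
    real = 4 + 1.25 * torch.randn(32, data_dim)   # real samples from the training distribution
    fake = G(torch.randn(32, noise_dim))          # generator turns noise into fake samples

    # Discriminator's turn: push D(real) toward 1 and D(fake) toward 0
    d_loss = criterion(D(real), torch.ones(32, 1)) + criterion(D(fake.detach()), torch.zeros(32, 1))
    opt_D.zero_grad()
    d_loss.backward()
    opt_D.step()

    # Generator's turn: push D(fake) toward 1, i.e. try to fool the discriminator
    g_loss = criterion(D(fake), torch.ones(32, 1))
    opt_G.zero_grad()
    g_loss.backward()
    opt_G.step()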
Discriminative vs Generative Models
What are Discriminative Models?
Discriminative models are ML models that concentrate on modeling the decision boundary between several classes of data using probability estimates and maximum likelihood. These types of models, mainly used for supervised learning, are also known as conditional models.
Discriminative models are not much affected by outliers. Although this often makes them a better choice than generative models, they can still suffer from misclassification problems, which can be a big drawback.
Popular Discriminative Models
Logistic Regression
Support Vector Machines
K-nearest Neighbor (KNN)
What are Generative Models?
Generative models are ML models and, as the name suggests, aim to capture the underlying distribution of data, and generate new data comparable to the original training data. These types of models, mainly used for unsupervised learning, are categorized as a class of statistical models capable of generating new data instances.
One drawback of generative models, when compared to discriminative models, is that they are more sensitive to outliers.
Popular Generative Models
Bayesian Network
Generative Adversarial Network (GAN)
Variational Autoencoders (VAEs)
Autoregressive models, Naive Bayes, Markov random fields, Hidden Markov Models (HMM), and Latent Dirichlet Allocation (LDA) are a few other examples of commonly used generative models.
Difference Between Discriminative and Generative Models
Characteristic | Discriminative Models | Generative Models |
---|---|---|
Objective | Focus on learning the boundary between different classes directly from the data. Their primary objective is to classify input data accurately based on the learned decision boundary. | Aim to understand the underlying data distribution and generate new data points that resemble the training data. They focus on modeling the process of data generation, allowing them to create synthetic data instances. |
Probability Distribution | Directly estimate the conditional probability P(Y\|X) from the training dataset. | Model the joint probability P(X, Y) and derive the posterior P(Y\|X) using Bayes' Theorem. |
Handling Outliers | Relatively robust to outliers | Prone to outliers |
Property | They possess discriminative properties but cannot generate new data. | They possess generative properties and can create new data instances. |
Applications | Commonly used in classification tasks, such as image recognition and sentiment analysis. | Commonly used in tasks like data generation, anomaly detection, and data augmentation, beyond traditional classification tasks. |
Examples | Logistic regression, Support Vector Machines, Decision trees, neural networks, etc. | Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), Naive Bayes, etc. |
The Role of Probability Distribution in Generative Models
What is Probability Distribution?
A probability distribution is a mathematical function that describes how the probabilities of the different possible outcomes of a random variable are distributed.
Types of Probability Distributions
There are two types of probability distributions −
- Discrete Probability Distributions
- Continuous Probability Distributions
Discrete Probability Distributions
Discrete probability distributions are mathematical functions that describe the probabilities of different outcomes of a discrete or categorical random variable.
A discrete probability distribution includes only those values that have a non-zero probability. In simple words, it does not include any value with zero probability. For example, 5.5 is not a possible outcome of a dice roll, hence it is not included in the probability distribution of dice rolls.
The total of the probabilities of all possible values in a discrete probability distribution is always one.
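As a quick illustration of these two properties, the short Python snippet below builds the probability distribution of a fair six-sided die (the die itself is just an assumed example):

# Probability distribution of a fair six-sided die: only possible outcomes appear,
# and the probabilities of all possible values sum to one.
die = {face: 1 / 6 for face in range(1, 7)}
print(die.get(5.5, 0))      # 5.5 is not a possible outcome, so it has no probability
print(sum(die.values()))    # 1.0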
Common Discrete Probability Distributions
Discrete Probability Distribution | Explanation | Example |
---|---|---|
Bernoulli Distribution | It describes the probability of success (1) or failure (0) in a single experiment. | The outcome of a single coin flip. |
Binomial Distribution | It models the number of successes in a fixed number of trials n, each with success probability p. | The number of heads when you toss a coin 10 times. |
Poisson Distribution | It models the probability of k events occurring in a fixed interval of time or space. | The number of email messages received per day. |
Geometric Distribution | It represents the number of trials needed to achieve the first success in a sequence of trials. | The number of times a coin is flipped until it lands on heads. |
Hypergeometric Distribution | It calculates the probability of drawing a specific number of successes from a finite population. | The number of red balls drawn from a bag of mixed colored balls. |
Continuous Probability Distributions
Continuous probability distributions are mathematical functions that describe the probabilities of different occurrences within a continuous range of values.
A continuous range includes an infinite number of possible values. For example, the interval [4, 5] contains infinitely many values between 4 and 5.
Common Continuous Probability Distributions
Continuous Probability Distribution | Explanation | Example |
---|---|---|
Continuous Uniform Distribution | It assigns equal probability to all values within a given interval. | The height of a person, modeled as equally likely anywhere between 5 and 6 feet. |
Normal (Gaussian) Distribution | It forms a bell-shaped curve and describes the data clustered around the mean and symmetrical tails. | IQ scores |
Exponential Distribution | It models the time between events in a Poisson process, where events occur at a constant rate. | The time until the next customer arrives. |
Log-normal Distribution | It represents right-skewed data whose logarithm follows a normal distribution. | Stock prices, income distributions, etc. |
Beta Distribution | It describes the random variables constrained to a finite interval. It is often used in Bayesian statistics. | The probability of success in a binomial trial. |
Use of Probability Distributions in Generative Modeling
Probability distributions play a crucial role in generative modeling.
- Data Distribution − Generative Models aim to capture the underlying probability distribution of data from which the samples are taken.
- Generating New Samples − Once the data distribution has been learned, generative models can generate new data comparable to the original dataset.
- Evaluation and Training − Probability distributions are used to evaluate and train generative models. Evaluation metrics such as likelihood, perplexity, and Wasserstein distance are used to evaluate the quality of generated samples compared to the original dataset.
- Variability and Uncertainty − Probability distributions are used to find the variability and uncertainty present in the data. Generative models can use this information to generate distinct and realistic samples.
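As a toy sketch of these ideas, the snippet below treats a single Gaussian as a (very simple) generative model: it estimates the distribution's parameters from data, draws new samples from it, and evaluates the likelihood of the data under it. The synthetic dataset and the choice of a Normal distribution are assumptions made purely for illustration.

import torch
from torch.distributions import Normal

# Toy "training data": 1-D samples from an unknown source distribution
data = 4 + 1.25 * torch.randn(1000)

# Capture the underlying data distribution by estimating its parameters
mu, sigma = data.mean(), data.std()
model = Normal(mu, sigma)

# Generate new samples comparable to the original dataset
new_samples = model.sample((5,))

# Evaluate the model: average log-likelihood of the data under the fitted distribution
avg_log_likelihood = model.log_prob(data).mean()
print(new_samples, avg_log_likelihood.item())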
Introduction to PyTorch framework for deep learning
Features
The major features of PyTorch are mentioned below −
Easy Interface − PyTorch offers an easy-to-use API; hence it is considered very simple to operate and runs on Python. Code execution in this framework is quite straightforward.
Python usage − This library is considered to be Pythonic and integrates smoothly with the Python data science stack. Thus, it can leverage all the services and functionalities offered by the Python environment.
Computational graphs − PyTorch provides an excellent platform which offers dynamic computational graphs. Thus a user can change them during runtime. This is highly useful when a developer has no idea of how much memory is required for creating a neural network model.
PyTorch is known for having three levels of abstraction as given below −
- Tensor − Imperative n-dimensional array which runs on GPU.
- Variable − Node in computational graph. This stores data and gradient.
- Module − Neural network layer which will store state or learnable weights.
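A brief sketch of these three levels is shown below. Note that in recent PyTorch releases the Variable API has been merged into Tensor, so a graph node is simply a tensor created with requires_grad=True; the shapes used here are arbitrary.

import torch
import torch.nn as nn

# Tensor: an imperative n-dimensional array (placed on GPU if one is available)
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(4, 3, device=device)

# Variable/autograd: a node in the computational graph that stores data and gradient
w = torch.randn(3, 2, device=device, requires_grad=True)
loss = (x @ w).sum()
loss.backward()
print(w.grad.shape)   # the gradient stored on the node

# Module: a neural network layer that stores learnable weights
layer = nn.Linear(3, 2).to(device)
print(list(layer.parameters())[0].shape)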
The following are the advantages of PyTorch −
- It is easy to debug and understand the code.
- It includes many layers, similar to Torch.
- It includes a lot of loss functions.
- It can be considered as a NumPy extension to GPUs.
- It allows building networks whose structure is dependent on computation itself.
Let us now create a simple neural network with one hidden layer and a single output unit.
Step 1
Import the PyTorch library using the commands below −
import torch
import torch.nn as nn
Step 2
Define all the layers and the batch size to start executing the neural network as shown below −
# Defining input size, hidden layer size, output size and batch size respectively
n_in, n_h, n_out, batch_size = 10, 5, 1, 10
Step 3
Since a neural network maps input data to the corresponding output data, we first create dummy input and target tensors as shown below −
# Create dummy input and target tensors (data)
x = torch.randn(batch_size, n_in)
y = torch.tensor([[1.0], [0.0], [0.0], [1.0], [1.0], [1.0], [0.0], [0.0], [1.0], [1.0]])
Step 4
Create a sequential model with the help of in-built functions. Using the below lines of code, create a sequential model −
# Create a model
model = nn.Sequential(nn.Linear(n_in, n_h), nn.ReLU(), nn.Linear(n_h, n_out), nn.Sigmoid())
Step 5
Construct the loss function and the optimizer (Stochastic Gradient Descent) as shown below −
#Construct the loss function
criterion = torch.nn.MSELoss()
# Construct the optimizer (Stochastic Gradient Descent in this case)
optimizer = torch.optim.SGD(model.parameters(), lr = 0.01)
Step 6
Implement the training loop using gradient descent with the given lines of code −
# Gradient Descent
for epoch in range(50):
   # Forward pass: Compute predicted y by passing x to the model
   y_pred = model(x)

   # Compute and print loss
   loss = criterion(y_pred, y)
   print('epoch: ', epoch, ' loss: ', loss.item())

   # Zero gradients, perform a backward pass, and update the weights.
   optimizer.zero_grad()

   # Perform a backward pass (backpropagation)
   loss.backward()

   # Update the parameters
   optimizer.step()
Step 7
The output generated is as follows −
epoch: 0 loss: 0.2545787990093231
epoch: 1 loss: 0.2545052170753479
epoch: 2 loss: 0.254431813955307
epoch: 3 loss: 0.25435858964920044
epoch: 4 loss: 0.2542854845523834
epoch: 5 loss: 0.25421255826950073
epoch: 6 loss: 0.25413978099823
epoch: 7 loss: 0.25406715273857117
epoch: 8 loss: 0.2539947032928467
epoch: 9 loss: 0.25392240285873413
epoch: 10 loss: 0.25385022163391113
epoch: 11 loss: 0.25377824902534485
....
Module 2
Architecture of GAN
GANs consist of two main models that work together to create realistic synthetic data. They are as follows:
1. Generator Model
The generator is a deep neural network that takes random noise as input to generate realistic data samples like images or text. It learns the underlying data patterns by adjusting its internal parameters during training through backpropagation. Its objective is to produce samples that the discriminator classifies as real.
The Role of Generator in GAN Architecture
The first primary part of GAN architecture is the Generator.
Generator: Function and Structure
The primary goal of the generator is to generate new data samples that are intended to resemble real data from the dataset. It begins with a random noise vector and transforms it through a series of layers, such as fully connected (Dense) or convolutional layers, to generate a synthetic data sample.
Generator: Layers and Components
Listed below are the layers and components of the generator neural network −
- Input Layer − The generator receives a low dimensionality random noise vector or input data as input.
- Fully Connected Layers − These layers are used to increase the dimensionality of the input noise vector.
- Transposed Convolutional Layers − These layers are also known as deconvolutional layers. They are used for upsampling, i.e., to generate an output feature map with greater spatial dimensions than the input feature map.
- Activation Functions − Two commonly used activation functions are Leaky ReLU and Tanh. The Leaky ReLU activation function helps in reducing the dying ReLU problem, while the Tanh activation function makes sure that the output is within a specific range.
- Output Layer − It produces the final data output like an image of a certain resolution.
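A minimal sketch of such a generator is given below. The noise size, channel counts, and target resolution (28×28 single-channel images) are illustrative assumptions, not values fixed by the text.

import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, noise_dim=100):
        super().__init__()
        self.fc = nn.Linear(noise_dim, 128 * 7 * 7)   # fully connected layer expands the noise vector
        self.net = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),   # upsample 7x7 -> 14x14
            nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),     # upsample 14x14 -> 28x28
            nn.Tanh(),                                             # keep outputs in [-1, 1]
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 128, 7, 7)
        return self.net(x)

fake_images = Generator()(torch.randn(16, 100))   # 16 synthetic 1x28x28 images
print(fake_images.shape)                          # torch.Size([16, 1, 28, 28])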
Generator Loss Function: The generator tries to minimize this loss:
$$J_G = -\frac{1}{m} \sum_{i=1}^{m} \log D(G(z_i))$$
where,
- $J_G$ measures how well the generator is fooling the discriminator.
- $G(z_i)$ is the generated sample produced from the random noise $z_i$.
- $D(G(z_i))$ is the discriminator's estimated probability that the generated sample is real.
2. Discriminator Model
The discriminator acts as a binary classifier that distinguishes between real and generated data. It learns to improve its classification ability through training, refining its parameters to detect fake samples more accurately. When dealing with image data, the discriminator uses convolutional layers or other relevant architectures that help extract features and enhance the model's ability.
The Role of Discriminator in GAN Architecture
The second part of GAN architecture is the Discriminator.
Discriminator: Function and Structure
The primary goal of the discriminator is to classify the input data as real or generated by the generator. It takes a data sample as input and gives a probability as output that indicates whether the sample is real or fake.
Discriminator: Layers and Components
Listed below are the layers and components of the discriminator neural network −
- Input Layer − The discriminator receives a data sample from either the real dataset or the generator as input.
- Convolutional Layers − These are used for downsampling the input data to extract relevant features.
- Fully Connected Layers − These layers are used to process the extracted features and make the final classification.
- Activation Functions − The discriminator uses the Leaky ReLU activation function to address the vanishing gradient problem and introduce non-linearity.
- Output Layer − As the name implies, it gives a single probability value between 0 and 1 as output, indicating whether the sample is real or fake.
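Below is a minimal sketch of a matching discriminator for 28×28 single-channel images; as before, the channel counts and image size are illustrative assumptions.

import torch
import torch.nn as nn

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1),    # downsample 28x28 -> 14x14
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # downsample 14x14 -> 7x7
            nn.LeakyReLU(0.2),
            nn.Flatten(),
            nn.Linear(128 * 7 * 7, 1),                   # fully connected classification head
            nn.Sigmoid(),                                # probability that the input is real
        )

    def forward(self, x):
        return self.net(x)

prob_real = Discriminator()(torch.randn(16, 1, 28, 28))
print(prob_real.shape)   # torch.Size([16, 1])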
Discriminator Loss Function: The discriminator tries to minimize this loss:
$$J_D = -\frac{1}{m} \sum_{i=1}^{m} \log D(x_i) - \frac{1}{m} \sum_{i=1}^{m} \log\left(1 - D(G(z_i))\right)$$
where,
- $J_D$ measures how well the discriminator classifies real and fake samples.
- $x_i$ is a real data sample.
- $G(z_i)$ is a fake sample from the generator.
- $D(x_i)$ is the discriminator's probability that $x_i$ is real.
- $D(G(z_i))$ is the discriminator's probability that the fake sample is real.
MinMax Loss
GANs are trained using a MinMax loss between the generator and the discriminator:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]$$
where,
- $G$ is the generator network and $D$ is the discriminator network.
- $p_{data}(x)$ is the true data distribution.
- $p_z(z)$ is the distribution of the random noise (usually normal or uniform).
- $D(x)$ is the discriminator's estimate that real data is real.
- $D(G(z))$ is the discriminator's estimate that generated data is real.
The generator tries to minimize this loss (to fool the discriminator) and the discriminator tries to maximize it (to detect fakes accurately).
Types of GANs
There are several types of GANs each designed for different purposes. Here are some important types:
1. Deep Convolutional GAN (DCGAN)
Deep Convolutional GANs (DCGANs) are among the most popular types of GANs used for image generation.
They are important because they:
- Use Convolutional Neural Networks (CNNs) instead of simple multi-layer perceptrons (MLPs).
- Replace max pooling layers with strided convolutions, which makes the model more efficient.
- Remove fully connected layers, which allows for better spatial understanding of images.
DCGANs are successful because they generate high-quality, realistic images.
Need for DCGANs:
WGAN architecture
WGANs use the Wasserstein distance, which provides a more meaningful and smoother measure of distance between distributions:
$$W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}[\|x - y\|]$$
where,
- $\gamma(x, y)$ denotes the mass transported from $x$ to $y$ in order to transform the distribution $P_r$ into $P_g$.
- $\Pi(P_r, P_g)$ denotes the set of all joint distributions $\gamma(x, y)$ whose marginals are respectively $P_r$ and $P_g$.
Benefits of WGAN algorithm over GAN
- WGAN is more stable due to the Wasserstein distance, which is continuous and differentiable almost everywhere, allowing gradient descent to be performed.
- It allows the critic to be trained till optimality.
- There is little to no evidence of mode collapse.
- It does not get stuck in local minima during gradient descent.
- WGANs provide more flexibility in the choice of network architectures; the weight clipping constant and the generator architecture can be changed as needed (see the sketch below).
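The core WGAN updates can be sketched as follows. Here "critic" denotes the discriminator-like network (without a sigmoid output), the networks and optimizers are assumed to be defined elsewhere, and the clipping constant 0.01 is the value commonly used in the original WGAN paper.

import torch

def wgan_critic_step(critic, optimizer_c, real, fake, clip_value=0.01):
    # The critic maximizes E[critic(real)] - E[critic(fake)], i.e. minimizes the negative
    loss_c = -(critic(real).mean() - critic(fake.detach()).mean())
    optimizer_c.zero_grad()
    loss_c.backward()
    optimizer_c.step()
    # Weight clipping keeps the critic approximately Lipschitz-constrained
    for p in critic.parameters():
        p.data.clamp_(-clip_value, clip_value)
    return loss_c.item()

def wgan_generator_step(critic, generator, optimizer_g, noise):
    # The generator minimizes -E[critic(G(z))]
    loss_g = -critic(generator(noise)).mean()
    optimizer_g.zero_grad()
    loss_g.backward()
    optimizer_g.step()
    return loss_g.item()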
Conditional GAN (cGAN)
Conditional GAN (cGAN) extends the GAN framework by including conditioning information, such as class labels, attributes, or even other data samples, in both the generator and the discriminator networks.
With the help of this conditioning information, Conditional GANs give us control over the characteristics of the generated output.
Architecture of Conditional GANs
Like traditional GANs, the architecture of a Conditional GAN consists of two main components: a generator network and a discriminator network.
The only difference is that in Conditional GANs, both the generator and the discriminator receive additional conditioning information y along with their respective inputs. Let's look at each network in turn.
The Generator Network
The generator network takes two inputs: a random noise vector sampled from a predefined distribution and the conditioning information "y". It transforms them into synthetic data samples. The goal of the generator is to produce data that not only resembles real data but also aligns with the provided conditioning information.
The Discriminator Network
The discriminator network receives both real data samples and fake samples generated by the generator, along with the conditioning information "y".
The goal of the discriminator network is to evaluate the input data and distinguish between real data samples from the dataset and fake data samples generated by the generator model, while considering the provided conditioning information.
Conditional Information
Conditional information, often denoted by "y", is additional information provided to both the generator network and the discriminator network to condition the generation process. Depending on the application and the required control over the generated output, conditional information can take various forms.
Types of Conditional Information
Some of the common types of conditional information are as follows −
- Class Labels − In image classification tasks, conditional information "y" may represent the class labels corresponding to different categories. For example, in handwritten digits dataset, "y" could indicate the digit class (0-9) that the generator network should produce.
- Attributes − In image generation tasks, conditional information "y" may represent specific attributes or features of the desired output, such as the color of objects, the style of clothing, or the pose of a person.
- Textual Descriptions − For text-to-image synthesis tasks, conditional information "y" may consist of textual descriptions or captions describing the desired characteristics of the generated image.
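A common way to inject class-label conditioning information "y" is to embed the label and concatenate it with the generator's noise vector and with the discriminator's input. The sketch below assumes 10 classes (for example, handwritten digits) and flattened 784-pixel images; these sizes are illustrative only.

import torch
import torch.nn as nn

num_classes, noise_dim, img_dim = 10, 100, 784

class ConditionalGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.label_emb = nn.Embedding(num_classes, num_classes)   # embed the label y
        self.net = nn.Sequential(
            nn.Linear(noise_dim + num_classes, 256), nn.ReLU(),
            nn.Linear(256, img_dim), nn.Tanh(),
        )

    def forward(self, z, y):
        # Condition the generation on y by concatenating noise and label embedding
        return self.net(torch.cat([z, self.label_emb(y)], dim=1))

class ConditionalDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.label_emb = nn.Embedding(num_classes, num_classes)
        self.net = nn.Sequential(
            nn.Linear(img_dim + num_classes, 256), nn.LeakyReLU(0.2),
            nn.Linear(256, 1), nn.Sigmoid(),
        )

    def forward(self, x, y):
        # Classify real vs. fake while considering the same conditioning information
        return self.net(torch.cat([x, self.label_emb(y)], dim=1))

z = torch.randn(4, noise_dim)
y = torch.randint(0, num_classes, (4,))
fake = ConditionalGenerator()(z, y)           # samples conditioned on class labels
score = ConditionalDiscriminator()(fake, y)   # real/fake probability given the same labels
print(fake.shape, score.shape)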
Applications of Conditional GANs
Listed below are some of the fields where Conditional GANs find their applications −
Image-to-Image Translation
Conditional GANs are best suited for tasks like translating images from one domain to another, such as converting satellite images to maps, transforming sketches into realistic images, or converting day-time scenes to night-time scenes.
Semantic Image Synthesis
Conditional GANs can be conditioned on semantic labels, so they can generate realistic images based on textual descriptions or semantic layouts.
Super-Resolution and Inpainting
Conditional GANs can also be used for image super-resolution tasks, in which low-resolution images are transformed into corresponding high-resolution images. They can also be used for inpainting tasks, in which missing parts of an image are filled in based on contextual information.
Style Transfer and Editing
Conditional GANs allow us to manipulate specific attributes like color, texture, or artistic style while preserving other aspects of the image.
Challenges in using Conditional GANs
Conditional GANs offer significant advancements in generative modeling, but they also have some challenges. Let's see what kind of challenges you can face while using Conditional GANs −
Mode Collapse
Like traditional GANs, Conditional GANs can also experience mode collapse. In mode collapse, the generator learns to produce limited varieties of samples and fails to capture the entire data distribution.
Conditioning Information Quality
The effectiveness of Conditional GANs depends on the quality and relevance of the provided conditioning information. Noisy or irrelevant conditioning information can lead to poor generation outputs.
Training Instability
The training instability issues observed in traditional GANs can also affect Conditional GANs. To avoid this, cGANs require careful architecture design and training approaches.
Scalability
As the complexity of the conditioning information increases, Conditional GANs become more difficult to handle and require more computational resources.