Studies

Posts

DISASTER ALERT SYSTEM

May 04, 2021

Introduction Since my Bachelor's years, I have seen many of my close friends and relatives die in road accidents. This is a serious problem that concerns me a lot. I decided then to follow this topic and the solutions that leading minds are working on. This post is a solution for a broader problem, where we train a model to detect disasters from tweets. The project proposal can be found here. This is the first door that opens onto a wide project with numerous possibilities. I used the existing tweets dataset from Kaggle and a Colab notebook to train my model. Dataset The following is the dataset. Fig 1: Disaster Twitter feed Dataset The dataset consists of the id of the tweet, keywords in the tweet text, the location of the tweet, and the tweet text. Only the training set has the target column, which holds the classification of the tweet. If the target is 0 th...

Sentiment Prediction using Naive Bayes Algorithm

April 27, 2021

Introduction This is a post about the Sentiment Prediction work that I did with the Naive Bayes Classifier. The dataset I used for the experiments was this sentiment labelled dataset, which has three types of review datasets: IMDB Movie Reviews, Amazon Product Reviews, and Yelp Reviews. I used the IMDB Movie Reviews dataset, whose text version can be found here. The Jupyter Notebook with the outcome of my experiments is committed in this GitHub Repository. Dataset The dataset consists of review and sentiment pairs as follows. Figure 1: The IMDB review dataset The reviews are movie reviews with both positive and negative sentiments. Each review is labelled with its sentiment, with positive as 0 and negative as 1. Goals The major part of this project was to understand the working of the Naive Bayes Classifier. The following are the important MVPs of the project: Predicting the sentiment for a given review. Dividing the dataset into train, dev and tes...
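As a minimal sketch of how a Naive Bayes classifier can predict the sentiment of a review, the example below trains on four made-up reviews. The whitespace tokenization, the add-one (Laplace) smoothing, the string labels, and the function names `train_nb` and `predict_nb` are assumptions for illustration, not details taken from the original notebook.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Count documents per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-probability for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Log prior: fraction of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, label) pairs above zero.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy training data, invented for this sketch.
reviews = [
    ("a great and moving film", "positive"),
    ("wonderful acting and a great story", "positive"),
    ("a dull and boring film", "negative"),
    ("terrible acting and a boring story", "negative"),
]
model = train_nb(reviews)
print(predict_nb(model, "great story"))
print(predict_nb(model, "boring film"))
```

Working in log-probabilities avoids numerical underflow when many small word probabilities are multiplied together.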

Overfitting in Machine Learning

April 08, 2021

Understanding Overfitting using Higher order linear regression This is a project done to understand overfitting using linear regression and polynomial models with scikit-learn. We will also see how to avoid overfitting using regularization techniques. Please refer to this Jupyter notebook for the whole code. First, we will understand overfitting... Overfitting Overfitting is the phenomenon where the model fits the training data almost perfectly but has large errors on unseen data. In short: small or zero training error but very high validation error, meaning the model has memorized the training data. If this happens, the model will only work if the input lies in the training data, and it will not behave as expected for unseen validation data. This usually happens with a complex model or a small training dataset. If a model is overfit supplying more data for tr...
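As a rough illustration of the overfitting described in the excerpt above, the sketch below fits polynomials of increasing degree to a small noisy sample. It uses NumPy's `polyfit` rather than scikit-learn, and the sine target, noise level, sample sizes, and degrees are all assumptions chosen for the demo, not values from the original notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def noisy_sine(x):
    # Assumed ground truth: a sine curve plus Gaussian noise.
    return np.sin(np.pi * x) + rng.normal(0.0, 0.2, size=x.shape)

# A small training set and a separate validation set (sizes are illustrative).
x_train = np.linspace(-1.0, 1.0, 10)
y_train = noisy_sine(x_train)
x_val = np.linspace(-0.95, 0.95, 50)
y_val = noisy_sine(x_val)

def mse(coeffs, x, y):
    # Mean squared error of the polynomial with these coefficients.
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

train_mse, val_mse = {}, {}
for degree in (1, 3, 9):
    # Least-squares polynomial fit on the training data only.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse[degree] = mse(coeffs, x_train, y_train)
    val_mse[degree] = mse(coeffs, x_val, y_val)
    print(f"degree {degree}: train MSE {train_mse[degree]:.4f}, "
          f"validation MSE {val_mse[degree]:.4f}")
```

With 10 training points, the degree-9 polynomial can pass through every one of them, so its training error drops to nearly zero while its validation error stays well above it.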

CIFAR10 Image Classifier using PyTorch

March 30, 2021

This is a blog post on the work that I have done in programming an image classifier for the CIFAR10 dataset. I followed many notebooks and tutorials, along with guidance from my Teaching Assistant Mondol, Md Ashaduzzaman Rubel, and my fellow classmate Subbiah Sharavanan, Abishek Pichaipillai, in creating this notebook. The Work First I had to work on a base tutorial code available on the PyTorch website here. It is a beginner tutorial for classifying images into their respective labels which are present in the CIFAR10 dataset. I also needed to try different optimizers provided by PyTorch, document the results, and draw some conclusions. CIFAR10 dataset: The CIFAR10 dataset consists of 60,000 labelled images, of which 10,000 are test images and 50,000 are training images. It has images from 10 classes, namely plane, car, bird, cat, deer, dog, frog, horse, ship, and truck. The ...

Training a Machine

March 05, 2021

Introduction A day ago I learned how to train a machine to close in on a prediction. I thought it was easy, as I just coded the math to narrow down on a prediction from the loss, or the inaccuracy. The best part was that I had expected it to be very complicated, but I understood it once I did the tedious math by hand and coded it myself. The steps were as follows: 1. Analyze and predict the function. 2. Assume certain values for the bias, constants, and step. 3. Plug the input into the prediction. 4. Calculate the loss. 5. Calculate the gradients for each of the constants and the bias. 6. Update the constants and the bias by subtracting the respective gradients. 7. Now repeat from step 3 till you get 0 loss. From the above steps, there are critical things that we need to note: Analyzing and predicting the function of the data was done for me. The data was also linear, so the resulting function was linear and predictable with the known function y = wx + b, where w is the weight constant and b is the bias constant n...
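The steps listed in the excerpt above can be sketched in plain Python for the linear model y = wx + b. The mean squared error loss, the starting values, the step size, the toy data, and the stopping tolerance (used in place of waiting for exactly zero loss) are assumptions made for this sketch, and `train_linear` is a hypothetical name.

```python
def train_linear(xs, ys, step=0.05, max_iters=10_000, tol=1e-12):
    """Fit y = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # assumed starting values for the weight and bias
    n = len(xs)
    loss = float("inf")
    for _ in range(max_iters):
        # Plug the inputs into the current prediction.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Calculate the loss (mean squared error, an assumption here).
        loss = sum(e * e for e in errors) / n
        if loss < tol:
            break  # close enough to zero loss
        # Gradients of the loss with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Update by subtracting the gradients scaled by the step size.
        w -= step * grad_w
        b -= step * grad_b
    return w, b, loss

# Toy data generated from y = 2x + 1, invented for this sketch.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b, loss = train_linear(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}, loss = {loss:.2e}")
```

Scaling the gradients by a small step keeps the updates from overshooting; with a step that is too large, the loss can grow instead of shrinking.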
