Monday, April 29, 2019

One Hot Encoding

One-hot encoding is a process that converts categorical variables into a numeric form that ML algorithms can use, which often helps them make better predictions.

CATEGORICAL DATA


Let's take a dataset of food names. In this dataset, if there were another food name, it would get the categorical value 4. As the number of unique values increases, the categorical values increase as well.
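As a rough illustration, the sketch below first assigns each food name an integer categorical value and then turns that value into a one-hot row. The food names and the helper functions `label_encode` and `one_hot_encode` are hypothetical, chosen only for this example; the original dataset is not shown here.

```python
# Hypothetical food names; the original dataset is not reproduced in the text.
foods = ["Apple", "Chicken", "Broccoli", "Apple"]


def label_encode(values):
    """Map each unique value to an integer, starting at 1, in order of first appearance."""
    mapping = {}
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping) + 1
    return mapping


def one_hot_encode(values):
    """Return the label mapping and one row per value with a single 1 in it."""
    mapping = label_encode(values)
    width = len(mapping)
    rows = []
    for value in values:
        row = [0] * width
        row[mapping[value] - 1] = 1  # categorical value k sets column k-1
        rows.append(row)
    return mapping, rows


mapping, encoded = one_hot_encode(foods)
print(mapping)
for food, row in zip(foods, encoded):
    print(food, row)
```

Adding a fourth food name would give it the categorical value 4 and widen every one-hot row by one column.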

Saturday, April 27, 2019

Slideshow using Notebook



The slides are created in a notebook as usual, but you'll need to designate which cells are slides and which type of slide each cell will be. In the menu bar, click View > Cell Toolbar > Slideshow to bring up the slide cell menu on each cell.


Data dimensions





SCALARS
  • They have 0 dimensions
  • Example: a person's height is a scalar

1      2.4      -0.3
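A minimal sketch of what "0 dimensions" means in code, assuming data is stored as plain numbers and nested Python lists. The helper `ndim` is hypothetical and written only for this illustration; it counts how many levels of list nesting wrap a value.

```python
def ndim(value):
    """Count the levels of list nesting around a value (0 for a bare number)."""
    depth = 0
    while isinstance(value, list):
        depth += 1
        # Look inside the first element; an empty list has nothing deeper.
        value = value[0] if value else None
    return depth


# The three scalars shown above each have 0 dimensions.
for scalar in (1, 2.4, -0.3):
    print(scalar, "has", ndim(scalar), "dimensions")
```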

Friday, April 26, 2019

Bag of words






The Problem with Text
A problem with modeling text is that it is messy, and techniques like machine learning algorithms prefer well-defined, fixed-length inputs and outputs.
Machine learning algorithms cannot work with raw text directly; the text must be converted into numbers. Specifically, vectors of numbers.
In language processing, the vectors x are derived from textual data, in order to reflect various linguistic properties of the text.
This is called feature extraction or feature encoding.
A popular and simple method of feature extraction with text data is called the bag-of-words model of text.
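The idea can be sketched in a few lines of standard-library Python. The two sentences, the whitespace tokenizer, and the helper names below are assumptions made for illustration, not part of any particular library. Each document becomes a fixed-length vector of word counts over a shared vocabulary.

```python
from collections import Counter

# Hypothetical example sentences, not taken from the original post.
documents = [
    "the cat sat on the mat",
    "the dog sat on the log",
]


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def build_vocabulary(docs):
    """Collect every distinct token across all documents, in sorted order."""
    return sorted({token for doc in docs for token in tokenize(doc)})


def bag_of_words(doc, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(tokenize(doc))
    return [counts[word] for word in vocabulary]


vocabulary = build_vocabulary(documents)
vectors = [bag_of_words(doc, vocabulary) for doc in documents]

print(vocabulary)
for vector in vectors:
    print(vector)
```

Every vector has the same length as the vocabulary, which gives the fixed-length numeric input that the algorithms need. Word order is discarded, which is why the model is called a "bag" of words.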

Thursday, April 25, 2019

ERROR FUNCTION IN NN



  • In most learning networks, the error is calculated as the difference between the actual output and the predicted output.
  • The error function tells us how far we are from the solution.
  • The function used to compute this error is known as the loss function.
  • Different loss functions give different errors for the same prediction and thus have a considerable effect on the performance of the model.
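To make the last point concrete, here is a small sketch that scores the same prediction with two common loss functions, mean absolute error and mean squared error. The target and predicted values are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_squared_error(actual, predicted):
    """Average of the squared differences between targets and predictions."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical targets and predictions; the last prediction is far off.
actual = [1.0, 0.0, 1.0]
predicted = [0.5, 0.0, 3.0]

mae = mean_absolute_error(actual, predicted)
mse = mean_squared_error(actual, predicted)
print("MAE:", mae)
print("MSE:", mse)
```

Squaring penalizes the one large mistake much more heavily, so the two functions report different errors for identical predictions.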
EXAMPLE:
Imagine we are standing on top of a mountain (Mount Everest) and we want to descend. It is not easy: it is cloudy, the mountain is big, and we can't see the big picture. We would look at all the possible directions in which we can walk.

Wednesday, April 24, 2019

Industries to be revolutionized by artificial intelligence



Artificial intelligence (AI) and machine learning (ML) have a rapidly growing presence in today’s world, with applications ranging from heavy industry to education. From streamlining operations to informing better decision making, it has become clear that this technology has the potential to truly revolutionize how the everyday world works.

According to a panel of Forbes Technology Council members, here are 13 industries that will soon be revolutionized by AI.

1. Cybersecurity

The enterprise attack surface is massive. With its power to bring complex reasoning and self-learning in an automated fashion at massive scale, AI will be a game-changer in how we improve our cyber-resilience. - Gaurav Banga, Balbix

Monday, April 22, 2019

GAN



  • They can make entirely new images that are realistic, even though they have never been seen before.
  • Most of the applications for GANs have been images.
STACKGAN
  • Takes a textual description of a bird and then generates a high-resolution image of a bird matching that description.
  • These pictures have never been seen before. It is not running an image search on a database; in fact, the GAN is drawing a sample from a probability distribution over all hypothetical images matching that description.
  • We can keep running the GAN to get more images.

Tuesday, April 16, 2019

SageMaker Services


SERVICES PROVIDED BY SAGEMAKER

1) Provides a Jupyter notebook instance
  • Used to explore and process data
2) API
  • This simplifies computationally difficult tasks like training and deploying machine learning models

Machine Learning Workflow



Machine Learning Workflow consists of 3 components
  • Explore and process data
  • Modeling
  • Deployment
EXPLORE AND PROCESS DATA
This component consists of exploring and processing the data.

Retrieve
The first step is to retrieve the data, which includes the test and train datasets. Let's take the example of a housing dataset that consists of CSV files. We need to download the data from the source.
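A minimal sketch of the retrieve step using only the Python standard library. The URL, the local file name, and the helper functions are hypothetical placeholders, since the source of the housing data is not named here.

```python
import csv
import os
import urllib.request

# Hypothetical URL and file name; the actual data source is not given in the text.
DATA_URL = "https://example.com/housing/train.csv"
LOCAL_PATH = "train.csv"


def download_if_missing(url, path):
    """Download url to path unless the file already exists, then return path."""
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path


def load_csv(path):
    """Read a CSV file with a header row into a list of dictionaries."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))
```

With a real URL in place, `rows = load_csv(download_if_missing(DATA_URL, LOCAL_PATH))` would fetch the file once and reuse the local copy on later runs. The same two calls can be repeated for the test file.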