Monday, April 29, 2019

One Hot Encoding

One-hot encoding is a process by which categorical variables are converted into a numeric form that can be provided to ML algorithms so that they do a better job at prediction.

CATEGORICAL DATA


Let's take a dataset of food names. In this dataset, if there were another food name, it would be assigned the categorical value 4. As the number of unique values increases, so does the number of categorical values.


What is Categorical Data?
  • Categorical data are variables that contain label values rather than numeric values.
  • The number of possible values is often limited to a fixed set.
  • Categorical variables are often called nominal.


Some categories may have a natural relationship to each other, such as a natural ordering.

CONVERT CATEGORICAL DATA INTO NUMERICAL DATA

This involves two steps:
  1. Integer Encoding
  2. One-Hot Encoding

1. Integer Encoding
As a first step, each unique category value is assigned an integer value.

For example, “Apple” is 1, “Chicken” is 2, and “Broccoli” is 3.

This is called a label encoding or an integer encoding and is easily reversible.

For some variables, this may be enough.

The integer values have a natural ordered relationship with each other, and machine learning algorithms may be able to understand and harness this relationship.
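As a rough illustration of the integer encoding step, here is a minimal sketch in plain Python. The list `foods` and the helper names `category_to_int` and `int_to_category` are made up for this example; integers are assigned in order of first appearance, starting at 1 to match the Apple/Chicken/Broccoli example above. A library encoder may number the categories differently (for instance starting at 0 or sorting alphabetically).

```python
# Hypothetical sample data, following the food example in this post.
foods = ["Apple", "Chicken", "Broccoli", "Apple"]

# Assign each unique category an integer in order of first appearance.
# Starting at 1 is an assumption made to match the example in the text.
category_to_int = {}
for food in foods:
    if food not in category_to_int:
        category_to_int[food] = len(category_to_int) + 1

# Encode the column as integers.
encoded = [category_to_int[food] for food in foods]

# The mapping is easily reversible: invert the dictionary to decode.
int_to_category = {code: food for food, code in category_to_int.items()}
decoded = [int_to_category[code] for code in encoded]

print(encoded)
print(decoded)
```

Keeping the mapping dictionary around is what makes the encoding reversible, so it should be stored alongside the encoded data.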



2. One-Hot Encoding
For categorical variables where no such ordinal relationship exists, the integer encoding is not enough.

In fact, using the integer encoding here and allowing the model to assume a natural ordering between categories may result in poor performance or unexpected results (such as predictions halfway between categories).

In this case, a one-hot encoding can be applied to the integer representation. This is where the integer encoded variable is removed and a new binary variable is added for each unique integer value.
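The one-hot step can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `one_hot` and the sample codes are hypothetical, and the integer codes are assumed to run from 1 to the number of categories, as in the Apple = 1, Chicken = 2, Broccoli = 3 example.

```python
def one_hot(int_codes, num_categories):
    """Turn integer codes (assumed to be 1..num_categories) into binary rows."""
    rows = []
    for code in int_codes:
        row = [0] * num_categories  # one binary variable per category
        row[code - 1] = 1           # switch on only the column for this category
        rows.append(row)
    return rows

# Hypothetical integer-encoded column: Apple=1, Chicken=2, Broccoli=3, Apple=1.
int_codes = [1, 2, 3, 1]
one_hot_rows = one_hot(int_codes, num_categories=3)

for row in one_hot_rows:
    print(row)
```

Each row contains exactly one 1, so no category appears larger or smaller than another, which is the point of replacing the integer column.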

