Done
Reply
![](https://codanics.com/wp-content/uploads/2023/10/e24e3cc0-f9cd-49cc-9e5e-e7fea619bd42.jpg)
I learned One-Hot Encoding, Label Encoding, and Ordinal Encoding, and I have completed the practice 100%.
Reply
![](https://codanics.com/wp-content/uploads/2024/05/My_profile_pic.jpg)
I have completed this lecture with 100% practice.
Reply
![](https://codanics.com/wp-content/uploads/2023/10/9dd24f5a-b137-440a-baba-1855925152a0.jpg)
AOA,
Feature Encoding in Data Preprocessing is done.
May ALLAH KAREEM grant you the good of both worlds. Ameen.
Reply
![](https://codanics.com/wp-content/uploads/2024/02/2.jpg)
Feature encoding is also a part of feature engineering, but it applies to categorical variables only.
Reply
![](https://codanics.com/wp-content/uploads/2023/10/WhatsApp-Image-2023-10-03-at-12.33.51-PM.jpeg)
Done. Thanks, Codanics.
Reply
A)- Standard scaling has no fixed bounds, but in practice most values of roughly normal data fall within about [-3, 3] (three standard deviations from the mean).
Reply
A)- Scaling data into a fixed range such as [-1, 1] or [0, 1] is min-max scaling; standard scaling (z-score) instead centers the data at mean 0 with standard deviation 1 and has no fixed range.
Reply
What is the range of standard scaling?
Reply
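To make the difference concrete, here is a minimal sketch using plain NumPy formulas rather than scikit-learn; the example values are made up for illustration:

```python
import numpy as np

# Toy column with one outlier (values chosen arbitrarily).
x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Standard scaling (z-score): center at mean 0, scale to std 1.
# The result has no fixed range.
z = (x - x.mean()) / x.std()

# Min-max scaling: always maps the data exactly into [0, 1].
m = (x - x.min()) / (x.max() - x.min())

print(z)
print(m)
```

Note how the z-scores are unbounded: the outlier 100.0 lands outside [-1, 1], while min-max scaling forces every value into [0, 1] by construction.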
![](https://codanics.com/wp-content/uploads/2023/10/MUHAMMAD-NOMAN.jpg)
One-Hot encoding vs Label encoding
Both techniques turn categorical values into numbers.
But what is the difference then?
Let's discuss 👇

Most ML algorithms struggle with categorical data. To avoid this, we usually use One-Hot encoding or Label encoding. After the transformation, we can train the models on numbers.

One-Hot encoding

This technique creates a new feature for every unique categorical value. If we have a dataset with 3 colors, one-hot encoding will create 3 new features. That can lead to issues as well, because for too many categories the dimensionality increases rapidly. For that reason, One-Hot encoding is better suited to data where the number of categories is small.

Note: by default, One-Hot encoding usually creates K dummies for K categories. That is redundant and can lead to issues; K-1 dummies are enough, but more on this in another post.

Label encoding

This technique replaces each unique categorical value with a consecutive number. For the same example dataset we get only 1 new feature instead of 3. So it is computationally more efficient, but it still has drawbacks. For example, the consecutive numbers can give a false impression of rank between the values: if Red is 2 and Green is 1, one could interpret it as Red > Green.

So which encoding technique should you use? It depends on the dataset and the model you want to use:

- Use One-Hot encoding for non-ordinal categories when the number of categories is small.
- Use Label encoding for ordinal data, or when the number of categories is large.
Reply
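The trade-offs above can be sketched with pandas (a hypothetical toy color column; `get_dummies` stands in for one-hot encoding, and category codes for label encoding):

```python
import pandas as pd

df = pd.DataFrame({"color": ["Red", "Green", "Blue", "Green"]})

# One-hot encoding: one new column per unique category (K columns).
onehot = pd.get_dummies(df["color"])
print(onehot.shape[1])  # 3 columns for 3 colors

# drop_first=True keeps K-1 columns; the dropped category is implied
# when all remaining dummies are zero, which avoids redundancy.
onehot_k1 = pd.get_dummies(df["color"], drop_first=True)
print(onehot_k1.shape[1])  # 2 columns

# Label encoding: a single column of integers. The numbers impose an
# artificial order (here Red=2 > Green=1), so reserve this for ordinal
# data or for models that can ignore the fake ranking.
codes = df["color"].astype("category").cat.codes
print(codes.tolist())  # [2, 1, 0, 1]
```

Note that pandas assigns category codes alphabetically by default (Blue=0, Green=1, Red=2), which is rarely the ordering you actually want for ordinal data; for true ordinal features, specify the category order explicitly.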