Join the conversation
1. Ordinal Encoding
Use Case: Categorical variables with inherent order or ranking.
Example: ["Low", "Medium", "High"] could be encoded as [1, 2, 3].
2. One-Hot Encoding
Use Case: Nominal categorical variables with no inherent order.
Example: ["Red", "Blue", "Green"] could be encoded as three separate binary columns: Red (1, 0, 0), Blue (0, 1, 0), Green (0, 0, 1).
3. Binary Encoding
Use Case: High-cardinality nominal categorical variables.
Example: "Category 15" could be encoded to binary and then split into separate columns.
4. Label Encoding
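Use Case: Converting category labels to integer codes, typically for the target variable. The codes imply an order, so use it on features only when that order is meaningful.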
Example: ["First", "Second", "Third"] could be encoded as [1, 2, 3].
5. Count Encoding
Use Case: When the frequency of occurrences of a category is relevant.
Example: A category that appears 10 times in the dataset would be encoded as 10.
6. Target Encoding / Mean Encoding
Use Case: When the relationship between the categorical variable and the target variable is important.
Example: Encoding categories based on the mean of the target variable for each category.
7. Frequency Encoding
Use Case: When the frequency of categories is relevant.
Example: A category appearing 5% of the time would be encoded as 0.05.
8. Feature Hashing
Use Case: Dealing with high-cardinality categorical features to reduce dimensionality.
Example: Hashing each category into a fixed number of columns.
9. Embedding Layers
Use Case: Feeding categorical variables into neural networks, especially when there are many categories.
Example: Mapping each category to a dense vector representation within the network.
10. Entity Embeddings of Categorical Variables
Use Case: Learning dense representations of categorical variables in deep learning scenarios.
Example: Similar to embedding layers, used to capture relationships between categories in a low-dimensional space.
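A minimal sketch of the first two encodings in the list above, written in plain Python so it runs without extra libraries. The helper names `ordinal_encode` and `one_hot_encode` and the sample values are illustrative assumptions, not code from the lecture.

```python
def ordinal_encode(values, order):
    """Map each value to its 1-based rank in the given order."""
    rank = {category: i for i, category in enumerate(order, start=1)}
    return [rank[v] for v in values]


def one_hot_encode(values, categories):
    """Return one binary row per value, with one column per category."""
    return [[1 if v == c else 0 for c in categories] for v in values]


levels = ["Low", "High", "Medium", "Low"]
print(ordinal_encode(levels, order=["Low", "Medium", "High"]))
# [1, 3, 2, 1]

colours = ["Red", "Blue", "Green"]
print(one_hot_encode(colours, categories=["Red", "Blue", "Green"]))
# [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

The order has to be supplied by hand for ordinal encoding, because the ranking comes from domain knowledge and not from the data itself.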
Brief Descriptions:
A. Ordinal Encoding: Used for categorical variables with inherent order or ranking.
B. One-Hot Encoding: Used for nominal categorical variables without inherent order.
C. Binary Encoding: Used with high-cardinality nominal categorical variables.
D. Label Encoding: Used when the ordinal relationship between categories is known and meaningful.
E. Count Encoding: Used when the frequency of occurrences of a category is relevant information.
F. Target Encoding / Mean Encoding: Used when the relationship between the categorical variable and the target variable is important.
G. Frequency Encoding: Used when the frequency of categories is relevant.
H. Feature Hashing: Used when dealing with high-cardinality categorical features to reduce dimensionality.
J. Embedding Layers: Used in neural networks to map categorical variables to dense vectors.
K. Entity Embeddings of Categorical Variables: Useful in deep learning scenarios for learning dense representations of categorical variables.
These encoding methods help transform categorical data into numerical formats suitable for machine learning models.
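Binary encoding from the list above can be sketched roughly as follows. This is a simplified illustration in plain Python; the helper name `binary_encode` and the choice to number categories alphabetically starting at 1 are assumptions made for the example.

```python
def binary_encode(values):
    """Give each category an integer ID, then split the ID's bits into columns."""
    categories = sorted(set(values))
    ids = {c: i for i, c in enumerate(categories, start=1)}  # IDs start at 1
    width = max(ids.values()).bit_length()  # number of bit columns needed
    # format(i, "0{width}b") gives the binary string padded with leading zeros
    return {c: [int(bit) for bit in format(i, f"0{width}b")] for c, i in ids.items()}


codes = binary_encode(["cat", "dog", "fish", "bird", "cat"])
for category, bits in codes.items():
    print(category, bits)
# bird [0, 0, 1]
# cat [0, 1, 0]
# dog [0, 1, 1]
# fish [1, 0, 0]
```

With this scheme, 1,000 categories would need only 10 bit columns, compared with 1,000 columns for one-hot encoding.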
Reply
![](https://codanics.com/wp-content/uploads/2024/04/IMG_16954467347071342.jpg)
Done
Reply
![](https://codanics.com/wp-content/uploads/2024/04/IMG_20240410_175137-1-scaled.jpg)
Done
Reply
1. Label Encoding: Assigns a unique integer label to each category. Used for ordinal data where the order matters.
2. One-Hot Encoding: Creates a binary column for each category, indicating its presence or absence. Best for nominal data, and works well when the number of categories is not too high.
3. Ordinal Encoding: Assigns numerical values based on the order. Useful for ordinal data when we have a clear order among categories.
4. Binary Encoding: Converts categories into binary code. Efficient when dealing with high cardinality categorical features.
5. Frequency Encoding: Uses the frequency of each category as its representation. Works well when categories with higher frequencies might carry more significance.
6. Target Encoding: Involves replacing a categorical value with the mean of the target variable for that category. Useful when we want to incorporate target variable information into the encoding. It is effective for improving model performance especially in classification tasks.
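As a rough illustration of target encoding as described in point 6, here is a plain-Python sketch. The helper name `target_mean_encode` and the toy data are assumptions for the example. In practice the means are usually computed on the training split only, since computing them on all rows leaks target information into the feature.

```python
from collections import defaultdict


def target_mean_encode(categories, targets):
    """Replace each category with the mean target value observed for it."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for c, y in zip(categories, targets):
        sums[c] += y
        counts[c] += 1
    means = {c: sums[c] / counts[c] for c in sums}
    return [means[c] for c in categories], means


cities = ["A", "B", "A", "C", "B", "A"]
bought = [1, 0, 1, 0, 1, 0]  # hypothetical binary target

encoded, means = target_mean_encode(cities, bought)
print(means["B"])  # 0.5
print(means["C"])  # 0.0
```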
Reply
You can use df.sample(5) to take five random data points from the data.
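A small sketch of that call, assuming pandas is installed. The DataFrame contents are made up for illustration, and `random_state` is added only so that the same rows come back on every run.

```python
import pandas as pd

# Hypothetical toy data, only for demonstrating the call
df = pd.DataFrame({
    "colour": ["Red", "Blue", "Green", "Red", "Blue", "Green", "Red", "Blue"],
    "size": ["Low", "High", "Medium", "Low", "Medium", "High", "Low", "High"],
})

# Draw 5 rows at random, without replacement by default
sample = df.sample(5, random_state=42)
print(sample)
```

Unlike `df.head(5)`, which always returns the first five rows, `df.sample(5)` picks rows from anywhere in the data.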
Reply
Mahboob ul-Hassan
Assignment:
Types of feature encoding:
1- Ordinal Encoding
2- One-Hot Encoding
3- Binary Encoding
4- Label Encoding
5- Count Encoding
6- Target Encoding or Mean Encoding
7- Frequency Encoding
8- Feature Hashing
9- Embedding Layers
10- Entity Embeddings of Categorical Variables
A- Ordinal Encoding is used for categorical variables which have an inherent order or ranking.
B- One-Hot Encoding is used for nominal categorical variables, i.e. categories with no inherent order.
C- Binary Encoding is used with high-cardinality nominal categorical variables.
D- Label Encoding is used when the ordinal relationship between categories is known and meaningful.
E- Count Encoding is used when the frequency of occurrences of a category is relevant information.
F- Target Encoding / Mean Encoding is used when the relationship between the categorical variable and the target variable is important.
G- Frequency Encoding is used when the frequency of categories is relevant.
H- Feature Hashing is used when dealing with high-cardinality categorical features to reduce dimensionality.
J- Embedding Layers are used when working with categorical variables in neural networks.
K- Entity Embeddings of Categorical Variables are useful in deep learning scenarios for learning dense representations of categorical variables.
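Feature hashing from the list above can be sketched like this. It is a simplified version of the idea, not any library's implementation: the helper name `hash_encode`, the use of MD5, and the default of 8 columns are all assumptions for the example.

```python
import hashlib


def hash_encode(value, n_columns=8):
    """Hash a category name into one of n_columns buckets."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    # hashlib gives the same result on every run, unlike the built-in hash()
    index = int(digest, 16) % n_columns
    row = [0] * n_columns
    row[index] = 1
    return row


for name in ["user_1", "user_2", "user_99999"]:
    print(name, hash_encode(name))
```

The number of columns stays fixed no matter how many categories appear, but two different categories can land in the same column (a collision), which is the price paid for the smaller size.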
Reply
![](https://codanics.com/wp-content/uploads/2023/10/e24e3cc0-f9cd-49cc-9e5e-e7fea619bd42.jpg)
I have done this video with 100% practice and
Assignment: Q1. How many types of feature encoding are there?
Feature encoding is a crucial step in the process of preparing data for machine learning models.
Ordinal Encoding
One-Hot Encoding
Binary Encoding
Label Encoding
Count Encoding
Target Encoding (Mean Encoding)
Frequency Encoding
Feature Hashing
Embedding Layers
Entity Embeddings of Categorical Variables
Q2. When to use which type of feature encoding?
Ordinal Encoding: Use when the categorical variable has an inherent order or ranking.
One-Hot Encoding: Suitable for nominal categorical variables (categories with no inherent order).
Binary Encoding: When dealing with high-cardinality nominal categorical variables.
Label Encoding: Suitable when the ordinal relationship between categories is known and meaningful.
Count Encoding: When the frequency of occurrences of a category is relevant information.
Target Encoding (Mean Encoding): When the relationship between the categorical variable and the target variable is important.
Frequency Encoding: Similar to count encoding, it can be used when the frequency of categories is relevant.
Feature Hashing: Useful when dealing with high-cardinality categorical features to reduce dimensionality.
Embedding Layers: In the context of deep learning, use embedding layers when working with categorical variables in neural networks.
Entity Embeddings of Categorical Variables: Similar to embedding layers, useful in deep learning scenarios for learning dense representations of categorical variables.
Reply
![](https://codanics.com/wp-content/uploads/2024/05/My_profile_pic.jpg)
I have done this lecture with 100% practice.
Reply
![](https://codanics.com/wp-content/uploads/2024/05/My_profile_pic.jpg)
Assignment: Q1. How many types of feature encoding are there?
Feature encoding is a crucial step in the process of preparing data for machine learning models.
Ordinal Encoding
One-Hot Encoding
Binary Encoding
Label Encoding
Count Encoding
Target Encoding (Mean Encoding)
Frequency Encoding
Feature Hashing
Embedding Layers
Entity Embeddings of Categorical Variables
Q2. When to use which type of feature encoding?
Ordinal Encoding:
Use when the categorical variable has an inherent order or ranking.
One-Hot Encoding:
Suitable for nominal categorical variables (categories with no inherent order).
Binary Encoding:
When dealing with high-cardinality nominal categorical variables.
Label Encoding:
Suitable when the ordinal relationship between categories is known and meaningful.
Count Encoding:
When the frequency of occurrences of a category is relevant information.
Target Encoding (Mean Encoding):
When the relationship between the categorical variable and the target variable is important.
Frequency Encoding:
Similar to count encoding, it can be used when the frequency of categories is relevant.
Feature Hashing:
Useful when dealing with high-cardinality categorical features to reduce dimensionality.
Embedding Layers:
In the context of deep learning, use embedding layers when working with categorical variables in neural networks.
Entity Embeddings of Categorical Variables:
Similar to embedding layers, useful in deep learning scenarios for learning dense representations of categorical variables.
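To make the idea of a dense representation concrete, here is a toy lookup table in plain Python. It only shows the lookup step: in a real network the vector values would be learned during training, whereas here they are random and fixed. The category names, the dimension of 4, and the helper name `embed` are assumptions for the example.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

categories = ["Red", "Blue", "Green"]
embedding_dim = 4  # assumed size of each dense vector

# Each category gets an integer index and one row in the table
index = {c: i for i, c in enumerate(categories)}
table = [[random.uniform(-1, 1) for _ in range(embedding_dim)] for _ in categories]


def embed(category):
    """Look up the dense vector for a category."""
    return table[index[category]]


print(embed("Blue"))
```

Each category costs only `embedding_dim` numbers regardless of how many categories exist, which is why this approach scales better than one-hot encoding for large vocabularies.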
![](https://codanics.com/wp-content/uploads/2023/10/9dd24f5a-b137-440a-baba-1855925152a0.jpg)
Assignment of Day-73: 13-Dec-2023 ML (day-6)
How many types of feature encoding are there, and when to use which type of feature encoding?
A list of common feature encoding techniques and their uses:
1- One-Hot Encoding (One-hot encoding is suitable for algorithms that can handle high-dimensional input.)
2- Label Encoding (Use label encoding when preserving the relative order of the categories is important.)
3- Ordinal Encoding (Ordinal encoding is used when there is an ordinal relationship between categories.)
4- Binary Encoding (Binary encoding is useful for reducing the dimensionality of high-cardinality categorical variables.)
5- Count Encoding (It replaces each category with the count of occurrences of that category in the dataset.)
6- Target Encoding (Use target encoding when the relationship between a categorical variable and the target variable is important.)
7- Feature Hashing (Feature hashing is suitable for categorical variables with high cardinality.)
8- Embedding (Embeddings capture semantic relationships between words or entities in a continuous vector space.)
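Count encoding, and the closely related frequency encoding, can be sketched in a few lines of plain Python. The sample values are made up for illustration.

```python
from collections import Counter

values = ["a", "b", "a", "c", "a", "b", "a", "a", "b", "a"]

counts = Counter(values)  # a: 6, b: 3, c: 1
total = len(values)

# Count encoding: replace each category with how often it occurs
count_encoded = [counts[v] for v in values]

# Frequency encoding: same idea, expressed as a share of all rows
freq_encoded = [counts[v] / total for v in values]

print(count_encoded)  # [6, 3, 6, 1, 6, 3, 6, 6, 3, 6]
print(freq_encoded)   # [0.6, 0.3, 0.6, 0.1, 0.6, 0.3, 0.6, 0.6, 0.3, 0.6]
```

One thing to watch for: two categories that happen to occur equally often receive the same code and can no longer be told apart.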
Reply
![](https://codanics.com/wp-content/uploads/2023/10/9dd24f5a-b137-440a-baba-1855925152a0.jpg)
AOA, in this lecture I learned about feature encoding for ML algorithms such as linear regression, and its types in Python, which are:
1-Label Encoding
2-One Hot Encoding
3-Ordinal Encoding
4-Binary Encoding
May Allah grant you a long life of health and well-being, and bless you with the good of both worlds, Ameen.
Reply