FeaturizationConfig Class 
Defines feature engineering configuration for automated machine learning experiments in Azure Machine Learning.
Use the FeaturizationConfig class in the featurization parameter of the
AutoMLConfig class. For more information,
see Configure automated ML experiments.
Create a FeaturizationConfig.
Constructor
FeaturizationConfig(blocked_transformers: List[str] | None = None, column_purposes: Dict[str, str] | None = None, transformer_params: Dict[str, List[Tuple[List[str], Dict[str, Any]]]] | None = None, drop_columns: List[str] | None = None, dataset_language: str | None = None, prediction_transform_type: str | None = None)
Parameters
| Name | Description |
|---|---|
| blocked_transformers | A list of transformer names to be blocked during featurization. Default value: None |
| column_purposes | A dictionary of column names and feature types used to update column purpose. Default value: None |
| transformer_params | A dictionary of transformers and their corresponding customization parameters. Default value: None |
| drop_columns | A list of columns to be ignored in the featurization process. This setting is being deprecated. Drop columns from your datasets as part of your data preparation process before providing the datasets to AutoML. Default value: None |
| dataset_language | Three-character ISO 639-3 code for the language(s) contained in the dataset. Languages other than English are only supported if you use GPU-enabled compute. Use the language code 'mul' if the dataset contains multiple languages. To find ISO 639-3 codes for different languages, see https://en.wikipedia.org/wiki/List_of_ISO_639-3_codes. Default value: None |
| prediction_transform_type | A string naming the target transform type used to cast the target column type. Default value: None |
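The type hints in the constructor signature imply a particular nesting for transformer_params: each transformer name maps to a list of (columns, parameters) pairs. The following is a minimal sketch in plain Python of values with that shape. The column names are borrowed from the examples in Remarks, and `describe` is a hypothetical helper for illustration, not part of the SDK.

```python
from typing import Any, Dict, List, Tuple

# Shape taken from the constructor's type hints:
# transformer name -> list of (column list, parameter dict) pairs.
TransformerParams = Dict[str, List[Tuple[List[str], Dict[str, Any]]]]

# Column name -> feature type, matching Dict[str, str].
column_purposes: Dict[str, str] = {
    "MYCT": "Numeric",
    "VendorName": "CategoricalHash",
}

transformer_params: TransformerParams = {
    "Imputer": [
        (["CACH"], {"strategy": "median"}),
        (["PRP"], {"strategy": "most_frequent"}),
    ],
}


def describe(params: TransformerParams) -> List[str]:
    """Hypothetical helper: flatten the nested structure into readable lines."""
    lines = []
    for transformer, entries in params.items():
        for cols, kwargs in entries:
            lines.append(f"{transformer} on {cols}: {kwargs}")
    return lines


for line in describe(transformer_params):
    print(line)
```

Assuming the SDK is installed, values shaped like these would be passed through the column_purposes and transformer_params keyword arguments shown in the signature above.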
Remarks
Featurization customization has methods that allow you to:
- Add or remove column purpose. With the add_column_purpose and remove_column_purpose methods, you can override the feature type for specified columns, for example, when the feature type of a column does not correctly reflect its purpose. The add method supports all the feature types given in the FULL_SET attribute of the FeatureType class.
- Add or remove transformer parameters. With the add_transformer_params and remove_transformer_params methods, you can change the parameters of customizable transformers like Imputer, HashOneHotEncoder, and TfIdf. Customizable transformers are listed in the CUSTOMIZABLE_TRANSFORMERS attribute of the SupportedTransformers class. Use the get_transformer_params method to look up customization parameters.
- Block transformers. Prevent transformers from being used in the featurization process with the add_blocked_transformers method. The transformers must be listed in the BLOCKED_TRANSFORMERS attribute of the SupportedTransformers class.
- Add a drop column to ignore for featurization and training with the add_drop_columns method. For example, you can drop a column that doesn't contain useful information.
- Add or remove prediction transform type. With the add_prediction_transform_type and remove_prediction_transform_type methods, you can override the existing target column type. Prediction transform types are listed in the PredictionTransformTypes attribute.
The following code example shows how to customize featurization in automated ML for forecasting. The example sets a column purpose and adds transformer parameters.
   from azureml.automl.core.featurization import FeaturizationConfig

   featurization_config = FeaturizationConfig()
   # Force the CPWVOL5 feature to be numeric type.
   featurization_config.add_column_purpose("CPWVOL5", "Numeric")
   # Fill missing values in the target column, Quantity, with zeros.
   featurization_config.add_transformer_params(
       "Imputer", ["Quantity"], {"strategy": "constant", "fill_value": 0}
   )
   # Fill missing values in the INCOME column with median value.
   featurization_config.add_transformer_params(
       "Imputer", ["INCOME"], {"strategy": "median"}
   )
   # Fill missing values in the Price column with forward fill (last value carried forward).
   featurization_config.add_transformer_params("Imputer", ["Price"], {"strategy": "ffill"})
Full sample is available from https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
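To make the three strategy names in the example concrete, the following sketch shows what "constant", "median", and "ffill" conventionally mean when applied to one column with missing values. It is a generic plain-Python illustration, not the SDK's Imputer implementation; `impute` is a hypothetical function and the sample values are made up.

```python
from statistics import median
from typing import List, Optional


def impute(values: List[Optional[float]], strategy: str,
           fill_value: float = 0.0) -> List[Optional[float]]:
    """Hypothetical illustration of three imputation strategies on one column."""
    if strategy == "constant":
        # Replace every missing value with the same fixed value.
        return [fill_value if v is None else v for v in values]
    if strategy == "median":
        # Replace missing values with the median of the observed values.
        observed = [v for v in values if v is not None]
        mid = median(observed)
        return [mid if v is None else v for v in values]
    if strategy == "ffill":
        # Carry the last observed value forward; a leading gap stays missing.
        filled: List[Optional[float]] = []
        last: Optional[float] = None
        for v in values:
            if v is not None:
                last = v
            filled.append(last)
        return filled
    raise ValueError(f"unsupported strategy: {strategy}")


column = [3.0, None, 5.0, None, 10.0]  # illustrative values only
print(impute(column, "constant"))  # [3.0, 0.0, 5.0, 0.0, 10.0]
print(impute(column, "median"))    # [3.0, 5.0, 5.0, 5.0, 10.0]
print(impute(column, "ffill"))     # [3.0, 3.0, 5.0, 5.0, 10.0]
```

Forward fill depends on row order, which is why it appears in the forecasting example, where rows are ordered in time.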
The next example shows customizing featurization in a regression problem using the Hardware Performance Dataset. In the example code, a blocked transformer is defined, column purposes are added, and transformer parameters are added.
   from azureml.automl.core.featurization import FeaturizationConfig

   featurization_config = FeaturizationConfig()
   featurization_config.blocked_transformers = ["LabelEncoder"]
   # featurization_config.drop_columns = ['MMIN']
   featurization_config.add_column_purpose("MYCT", "Numeric")
   featurization_config.add_column_purpose("VendorName", "CategoricalHash")
   # The default strategy is mean; set transformer parameters for three columns.
   featurization_config.add_transformer_params("Imputer", ["CACH"], {"strategy": "median"})
   featurization_config.add_transformer_params(
       "Imputer", ["CHMIN"], {"strategy": "median"}
   )
   featurization_config.add_transformer_params(
       "Imputer", ["PRP"], {"strategy": "most_frequent"}
   )
   # featurization_config.add_transformer_params('HashOneHotEncoder', [], {"number_of_bits": 3})
Full sample is available from https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/regression-explanation-featurization/auto-ml-regression-explanation-featurization.ipynb
The FeaturizationConfig defined in the code example above can then be used in the configuration of an automated ML experiment, as shown in the next code example.
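   import logging

   from azureml.train.automl import AutoMLConfig

   # compute_target, train_data, and label are assumed to be defined earlier, as in the full sample.
   automl_settings = {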
       "enable_early_stopping": True,
       "experiment_timeout_hours": 0.25,
       "max_concurrent_iterations": 4,
       "max_cores_per_iteration": -1,
       "n_cross_validations": 5,
       "primary_metric": "normalized_root_mean_squared_error",
       "verbosity": logging.INFO,
   }
   automl_config = AutoMLConfig(
       task="regression",
       debug_log="automl_errors.log",
       compute_target=compute_target,
       featurization=featurization_config,
       training_data=train_data,
       label_column_name=label,
       **automl_settings,
   )
Full sample is available from https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/regression-explanation-featurization/auto-ml-regression-explanation-featurization.ipynb
Methods
| Name | Description |
|---|---|
| add_blocked_transformers | Add transformers to be blocked. |
| add_column_purpose | Add a feature type for the specified column. | 
| add_drop_columns | Add column name or list of column names to ignore. | 
| add_prediction_transform_type | Add a prediction transform type for the target column. |
| add_transformer_params | Add customized transformer parameters to the list of custom transformer parameters. Apply to all columns if column list is empty. | 
| get_transformer_params | Retrieve transformer customization parameters for columns. | 
| remove_column_purpose | Remove the feature type for the specified column. If no feature is specified for a column, the detected default feature is used. | 
| remove_prediction_transform_type | Revert the prediction transform type to default for target column. | 
| remove_transformer_params | Remove transformer customization parameters for specific column or all columns. | 
add_blocked_transformers
Add transformers to be blocked.
add_blocked_transformers(transformers: str | List[str]) -> None
Parameters
| Name | Description |
|---|---|
| transformers | Required. A transformer name or list of transformer names. Transformer names must be listed in the BLOCKED_TRANSFORMERS attribute of the SupportedTransformers class. |
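Because transformers accepts either a single name or a list of names, both call forms are valid. The sketch below shows one way such an argument might be normalized and merged into an existing block list. `merge_blocked` is a hypothetical helper, not the SDK's implementation, and it assumes duplicate names should be ignored; the transformer names are taken from elsewhere on this page.

```python
from typing import List, Optional, Union


def merge_blocked(current: Optional[List[str]],
                  transformers: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: merge new names into a block list without duplicates."""
    # Accept a single name or a list of names, as the signature allows.
    new_names = [transformers] if isinstance(transformers, str) else list(transformers)
    merged = list(current or [])
    for name in new_names:
        if name not in merged:
            merged.append(name)
    return merged


blocked = merge_blocked(None, "LabelEncoder")
blocked = merge_blocked(blocked, ["LabelEncoder", "TfIdf"])
print(blocked)  # ['LabelEncoder', 'TfIdf']
```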
add_column_purpose
Add a feature type for the specified column.
add_column_purpose(column_name: str, feature_type: str) -> None
Parameters
| Name | Description |
|---|---|
| column_name | Required. A column name to update. |
| feature_type | Required. A feature type to use for the column. The feature type must be one of those given in the FULL_SET attribute of the FeatureType class. |
add_drop_columns
Add column name or list of column names to ignore.
add_prediction_transform_type
Add a prediction transform type for the target column.
add_prediction_transform_type(prediction_transform_type: str) -> None
Parameters
| Name | Description |
|---|---|
| prediction_transform_type | Required. A prediction transform type to be used for casting the target column. The type must be one of those given in the FULL_SET attribute of the PredictionTransformTypes class. |
add_transformer_params
Add customized transformer parameters to the list of custom transformer parameters.
Apply to all columns if column list is empty.
add_transformer_params(transformer: str, cols: List[str], params: Dict[str, Any]) -> None
Parameters
| Name | Description |
|---|---|
| transformer | Required. The transformer name. The transformer name must be one of the CUSTOMIZABLE_TRANSFORMERS listed in the SupportedTransformers class. |
| cols | Required. Input columns for the specified transformer. Some transformers can take multiple columns as input, specified as a list. |
| params | Required. A dictionary of keywords and arguments. |
Remarks
The following code example shows how to customize featurization in automated ML for forecasting. The example sets a column purpose and adds transformer parameters.
   from azureml.automl.core.featurization import FeaturizationConfig

   featurization_config = FeaturizationConfig()
   # Force the CPWVOL5 feature to be numeric type.
   featurization_config.add_column_purpose("CPWVOL5", "Numeric")
   # Fill missing values in the target column, Quantity, with zeros.
   featurization_config.add_transformer_params(
       "Imputer", ["Quantity"], {"strategy": "constant", "fill_value": 0}
   )
   # Fill missing values in the INCOME column with median value.
   featurization_config.add_transformer_params(
       "Imputer", ["INCOME"], {"strategy": "median"}
   )
   # Fill missing values in the Price column with forward fill (last value carried forward).
   featurization_config.add_transformer_params("Imputer", ["Price"], {"strategy": "ffill"})
Full sample is available from https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/forecasting-orange-juice-sales/auto-ml-forecasting-orange-juice-sales.ipynb
get_transformer_params
Retrieve transformer customization parameters for columns.
get_transformer_params(transformer: str, cols: List[str]) -> Dict[str, Any]
Parameters
| Name | Description |
|---|---|
| transformer | Required. The transformer name. The transformer name must be one of the CUSTOMIZABLE_TRANSFORMERS listed in the SupportedTransformers class. |
| cols | Required. The column names to get information for. Use an empty list to specify all columns. |
Returns
| Type | Description |
|---|---|
| Dict[str, Any] | Transformer parameter settings. |
remove_column_purpose
Remove the feature type for the specified column.
If no feature is specified for a column, the detected default feature is used.
remove_column_purpose(column_name: str) -> None
Parameters
| Name | Description |
|---|---|
| column_name | Required. The column name to update. |
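The fallback described for remove_column_purpose can be pictured as a lookup that prefers an explicit override and otherwise uses the detected type. The following is a minimal sketch under that assumption. `effective_feature_type` and the `detected` mapping are hypothetical names, and the detected types shown are made up for illustration.

```python
from typing import Dict


def effective_feature_type(column: str,
                           overrides: Dict[str, str],
                           detected: Dict[str, str]) -> str:
    """Hypothetical helper: an explicit column purpose wins, else the detected type."""
    return overrides.get(column, detected[column])


detected = {"MYCT": "Categorical", "CACH": "Numeric"}  # illustrative values only
overrides = {"MYCT": "Numeric"}  # analogous to calling add_column_purpose

print(effective_feature_type("MYCT", overrides, detected))  # Numeric

overrides.pop("MYCT")  # analogous to calling remove_column_purpose
print(effective_feature_type("MYCT", overrides, detected))  # Categorical
```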
remove_prediction_transform_type
Revert the prediction transform type to default for target column.
remove_prediction_transform_type() -> None
remove_transformer_params
Remove transformer customization parameters for a specific column or all columns.
remove_transformer_params(transformer: str, cols: List[str] | None = None) -> None
Parameters
| Name | Description |
|---|---|
| transformer | Required. The transformer name. The transformer name must be one of the CUSTOMIZABLE_TRANSFORMERS listed in the SupportedTransformers class. |
| cols | The column names to remove customization parameters from. Specify None (the default) to remove all customization parameters for the specified transformer. Default value: None |
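The add, get, and remove methods for transformer parameters work together on the nested structure implied by the constructor's type hints. The sketch below models that interplay with a small stand-in class. `TransformerParamStore` is hypothetical and is not the SDK's implementation: it assumes that adding parameters for the same column list replaces the earlier entry and that a lookup with no match returns an empty dictionary.

```python
from typing import Any, Dict, List, Optional, Tuple


class TransformerParamStore:
    """Hypothetical stand-in that models the documented add/get/remove behavior."""

    def __init__(self) -> None:
        # transformer name -> list of (column list, parameter dict) pairs
        self._params: Dict[str, List[Tuple[List[str], Dict[str, Any]]]] = {}

    def add(self, transformer: str, cols: List[str], params: Dict[str, Any]) -> None:
        entries = self._params.setdefault(transformer, [])
        # Assumption: a new entry for the same column list replaces the old one.
        entries[:] = [entry for entry in entries if entry[0] != cols]
        entries.append((list(cols), dict(params)))

    def get(self, transformer: str, cols: List[str]) -> Dict[str, Any]:
        for entry_cols, params in self._params.get(transformer, []):
            if entry_cols == cols:
                return params
        return {}  # Assumption: no match yields an empty dictionary.

    def remove(self, transformer: str, cols: Optional[List[str]] = None) -> None:
        if cols is None:
            # None removes every customization for this transformer.
            self._params.pop(transformer, None)
            return
        entries = self._params.get(transformer, [])
        entries[:] = [entry for entry in entries if entry[0] != cols]


store = TransformerParamStore()
store.add("Imputer", ["CACH"], {"strategy": "median"})
store.add("Imputer", ["PRP"], {"strategy": "most_frequent"})

store.remove("Imputer", ["CACH"])         # remove one column's customization
print(store.get("Imputer", ["CACH"]))     # {}
print(store.get("Imputer", ["PRP"]))      # {'strategy': 'most_frequent'}

store.remove("Imputer")                   # cols=None removes all for "Imputer"
print(store.get("Imputer", ["PRP"]))      # {}
```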