Set Dtypes

dtypes

Configs.set_dtypes


Configs.set_dtypes(dataframe, dtypes_dict=None)

This function sets data types for variables in the provided DataFrame.

Parameters:

  • dataframe : pandas.DataFrame
    The DataFrame for which data types are to be set.

  • dtypes_dict : dict or str, optional
    A dictionary mapping variable names to their data types. If 'reset', data types are reset to the original values obtained from ix.Configs.get_dtypes(). If None (default), data types are determined automatically using ix.Configs.get_dtypes().

Returns:

  • str
    A confirmation message indicating that the operation has completed.

Examples:

The recommended approach is to copy the dictionary returned by the get_dtypes function, paste it into this function as dtypes_dict, and then edit the entries for any variables whose detected type does not match your requirements.

This way every variable starts from its automatically detected type, and only the entries you change differ from it in subsequent analyses.

# To set data types for variables in the Titanic dataset based on a custom dictionary:

>>> ix.Configs.set_dtypes(titanic,
...     dtypes_dict={
...         'PassengerId': 'continuous',
...         'Survived': 'categorical',
...         'Pclass': 'categorical',
...         'Name': 'text',
...         'Sex': 'categorical',
...         'Age': 'continuous',
...         'SibSp': 'categorical',
...         'Parch': 'categorical',
...         'Ticket': 'text',
...         'Fare': 'continuous',
...         'Cabin': 'text',
...         'Embarked': 'categorical'
...     })
'done'
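
The copy-and-edit workflow can also be done programmatically instead of by hand. The following is a minimal sketch in plain Python, not part of ADIX: `apply_overrides` is a hypothetical helper, and the `detected` dictionary holds illustrative values standing in for whatever get_dtypes reports for your data.

```python
def apply_overrides(detected, overrides):
    """Return a copy of `detected` with the entries in `overrides` replaced.

    Hypothetical helper (not an ADIX function). It raises KeyError when an
    override names a variable that is missing from `detected`, which catches
    misspelled column names before the dictionary is used.
    """
    unknown = sorted(set(overrides) - set(detected))
    if unknown:
        raise KeyError(f"unknown variables: {unknown}")
    return {**detected, **overrides}


# Illustrative values only; in practice, paste the dictionary from get_dtypes.
detected = {
    'Survived': 'continuous',
    'Pclass': 'continuous',
    'Name': 'text',
    'Age': 'continuous',
}

# Edit only the variables whose detected type is not the one you want.
dtypes_dict = apply_overrides(
    detected,
    {'Survived': 'categorical', 'Pclass': 'categorical'},
)

# The result could then be passed on, for example:
# ix.Configs.set_dtypes(titanic, dtypes_dict=dtypes_dict)
```

Because the helper builds a new dictionary, the detected values stay available for comparison after the edits.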

When dtypes_dict is set to 'reset', the data types cache is reset to its initial state, reverting the values to those initially retrieved by the get_dtypes function.

# To reset data types to their original values:
>>> ix.Configs.set_dtypes(titanic, dtypes_dict='reset')
'done'

When dtypes_dict is None (the default), the data types are determined automatically by the get_dtypes function, which ADIX also runs internally whenever a new DataFrame is initialized. This call does not reset the values to their originals; to do that, use dtypes_dict='reset' instead.

# To determine data types automatically:
>>> ix.Configs.set_dtypes(titanic)
'done'
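
The difference between the three forms of dtypes_dict can be pictured with a toy model. The sketch below is one reading of the behavior described above, written in plain Python; `DtypeCacheSketch` is a hypothetical class and not ADIX code. In particular, it assumes that a custom dictionary updates the existing entries rather than replacing the whole cache, which the text does not specify.

```python
class DtypeCacheSketch:
    """Toy stand-in for an internal dtype cache (hypothetical, not ADIX)."""

    def __init__(self, detected):
        self._original = dict(detected)  # values from automatic detection
        self.current = dict(detected)    # values currently in effect

    def set_dtypes(self, dtypes_dict=None):
        if isinstance(dtypes_dict, str):
            if dtypes_dict != 'reset':
                raise ValueError("the only supported string is 'reset'")
            # 'reset': go back to the automatically detected values.
            self.current = dict(self._original)
        elif dtypes_dict is not None:
            # Custom dictionary: assumed here to update existing entries.
            self.current.update(dtypes_dict)
        # None: nothing is reverted; current values stay in effect.
        return 'done'


cache = DtypeCacheSketch({'Age': 'continuous', 'Pclass': 'continuous'})

cache.set_dtypes({'Pclass': 'categorical'})
after_custom = dict(cache.current)   # Pclass is now 'categorical'

cache.set_dtypes()                   # default None: custom value is kept
after_none = dict(cache.current)

cache.set_dtypes('reset')            # back to the detected values
after_reset = dict(cache.current)
```

In this model, only the 'reset' call discards a custom assignment, which matches the warning that the default call does not restore the original values.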

TIP: Setting dtypes inside the ADIX environment will NOT change the data types of the pandas DataFrame itself. The ADIX dtypes are used only internally, to describe the variables more precisely.
