Strategies for data transformation include the following:
1. Smoothing, which works to remove noise from the data. Techniques include binning, regression, and clustering.
2. Attribute construction (or feature construction), where new attributes are constructed and added from the given set of attributes to help the mining process.
3. Aggregation, where summary or aggregation operations are applied to the data. For example, the daily sales data may be aggregated so as to compute monthly and annual total amounts.
4. Normalization, where the attribute data are scaled so as to fall within a smaller range, such as −1.0 to 1.0, or 0.0 to 1.0.
5. Discretization, where the raw values of a numeric attribute (e.g., age) are replaced by interval labels (e.g., 0–10, 11–20, etc.) or conceptual labels (e.g., youth, adult, senior).
6. Concept hierarchy generation for nominal data, where attributes such as street can be generalized to higher-level concepts, like city or country.
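
Four of the strategies listed above (smoothing by binning, aggregation, normalization, and discretization) can be illustrated with a minimal Python sketch. It uses only the standard library. The function names, the equal-frequency binning scheme, the min-max scaling formula, and the age cut points (18 and 65) are illustrative assumptions, not definitions taken from the text.

```python
from collections import defaultdict
from statistics import mean


def smooth_by_bin_means(values, bin_size):
    """Smoothing by binning: sort the values, split them into
    equal-frequency bins of `bin_size`, and replace every value
    with the mean of its bin. Returns the values in sorted order."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed


def aggregate_monthly(daily_sales):
    """Aggregation: sum daily amounts into monthly totals.
    `daily_sales` is a list of ("YYYY-MM-DD", amount) pairs
    (a hypothetical input layout chosen for this sketch)."""
    totals = defaultdict(float)
    for date, amount in daily_sales:
        totals[date[:7]] += amount  # "YYYY-MM" is the month key
    return dict(totals)


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Normalization: linearly rescale values into [new_min, new_max]
    (min-max scaling, one of several possible normalization methods)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so map everything to the lower bound.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min
            for v in values]


def discretize_age(age):
    """Discretization: replace a numeric age with a conceptual label.
    The cut points 18 and 65 are assumptions made for illustration."""
    if age < 18:
        return "youth"
    if age < 65:
        return "adult"
    return "senior"


if __name__ == "__main__":
    print(smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], 3))
    print(aggregate_monthly([("2024-01-05", 100.0),
                             ("2024-01-20", 50.0),
                             ("2024-02-01", 25.0)]))
    print(min_max_normalize([10, 20, 30], -1.0, 1.0))
    print([discretize_age(a) for a in (12, 40, 70)])
```

In practice, a data analysis library would usually handle these steps; the hand-written versions here are only meant to make each transformation concrete.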