Unlocking Data Insights Through Arithmetic Feature Engineering
Diving into the world of data, we find that numbers tell stories, and arithmetic is the language they speak. Arithmetic feature engineering is a fundamental step in preparing your dataset for machine learning, transforming raw data into insightful features that capture the underlying patterns and relationships.
What is Arithmetic Feature Engineering?
Arithmetic feature engineering involves the application of basic mathematical operations—such as addition, subtraction, multiplication, and division—to create new features that are more predictive or have more analytical value than the original raw data.
Why Apply Arithmetic in Feature Engineering?
Reveal Hidden Relationships: Simple operations can uncover relationships between variables that aren't immediately obvious.
Scale to Significance: Put variables on a comparable scale so that no feature dominates simply because of its units or range.
Interaction Features: Create features that represent the interaction between two or more variables, which could be critical for certain models.
Data Normalization: Rescale features for algorithms that are sensitive to feature magnitude, such as distance-based or gradient-based methods.
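The scaling and interaction ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not how Feature Flow works internally: the records, the column names (age, income) and the helper standardize are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical toy records; the column names and values are illustrative only.
rows = [
    {"age": 25, "income": 30000.0},
    {"age": 40, "income": 60000.0},
    {"age": 55, "income": 90000.0},
]


def standardize(values):
    """Return z-scores: subtract the mean, divide by the population std dev."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column has no spread, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


age_z = standardize([r["age"] for r in rows])
income_z = standardize([r["income"] for r in rows])

for row, a, i in zip(rows, age_z, income_z):
    row["age_z"] = a
    row["income_z"] = i
    # Interaction feature: product of the two standardized columns.
    row["age_x_income"] = a * i

for row in rows:
    print(row)
```

Standardizing before multiplying keeps the interaction term from being dominated by whichever column has the larger raw range. Note that this rescaling changes only the location and spread of a column, not the shape of its distribution.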
Examples of Arithmetic Feature Engineering
Ratios: Comparing two features, like "Income to Debt" ratio for credit scoring, to understand relative magnitude.
Differences: Measuring change, such as "Pre- and Post-Campaign Sales Difference" to assess the impact of a marketing campaign.
Summations: Adding up components, like "Total Spend per Customer" from individual transactions, to capture overall behavior.
Products: Multiplying features, like "Area" calculated from "Length x Width," for geometry-based insights in real estate analytics.
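The four kinds of features listed above map directly onto the four basic operations. The sketch below uses made-up values and hypothetical variable names purely for illustration; the helper safe_ratio is an assumption added here to handle a zero denominator, which ratio features need to account for.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


# Hypothetical toy values; none of these numbers come from a real dataset.
income, debt = 5000.0, 1000.0
sales_pre, sales_post = 120.0, 150.0
transactions = [20.0, 35.5, 14.5]
length, width = 10.0, 8.0

features = {
    "income_to_debt": safe_ratio(income, debt),  # ratio
    "sales_difference": sales_post - sales_pre,  # difference
    "total_spend": sum(transactions),            # summation
    "area": length * width,                      # product
}

print(features)
```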
Applying Arithmetic Operations in Feature Flow
Feature Flow on our platform simplifies arithmetic feature engineering. You can:
Drag-and-Drop Arithmetic Functions: Select the data features you wish to transform and apply the desired arithmetic operations.
Preview Outcomes: Immediately see how the operations affect your data with live previews, ensuring you’re on the right track.
Customize Calculations: Use custom expressions to define complex arithmetic that matches your specific analytical needs.
Final Thoughts: By incorporating arithmetic operations into your feature engineering workflow, you're not just crunching numbers—you're crafting a narrative. With each operation, you're stepping closer to unlocking the predictive potential nestled within your dataset.
Tutorial: Conducting Arithmetic Feature Engineering on Your Dataset
Welcome to our step-by-step tutorial on performing arithmetic feature engineering using Octai. By the end of this guide, you'll be able to create new, meaningful features from your existing dataset with ease.
Step 1: Importing Your Dataset
Log in to the platform and create a new project. Let's call it E-Commerce.
Navigate to the 'Import' section.
Click on 'Add Data' and select the CSV file you wish to upload. In this tutorial, we will use Customer Data, which you can download in .csv format.
After you upload the data to the platform, the Overview tab shows the data in an Excel-like table, and the Summary Statistics tab shows its descriptive statistics.
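For readers who want to see what the import and summary steps amount to outside the platform, here is a rough Python equivalent. It is only a sketch: the inline CSV stands in for the tutorial file, the CustomerID column and all values are invented, and the helpers load_rows and describe are hypothetical names. Only Quantity and UnitPrice are column names taken from this tutorial.

```python
import csv
import io
from statistics import mean, median

# Inline stand-in for the tutorial CSV. CustomerID and every value are made up;
# only the Quantity and UnitPrice column names come from the tutorial.
SAMPLE_CSV = """CustomerID,Quantity,UnitPrice
C001,2,9.50
C002,5,3.00
C003,1,20.00
C004,4,2.50
"""


def load_rows(text):
    """Parse CSV text into a list of dicts with numeric columns converted."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for raw in reader:
        rows.append({
            "CustomerID": raw["CustomerID"],
            "Quantity": int(raw["Quantity"]),
            "UnitPrice": float(raw["UnitPrice"]),
        })
    return rows


def describe(values):
    """Return a few descriptive statistics for one numeric column."""
    return {
        "count": len(values),
        "mean": mean(values),
        "min": min(values),
        "median": median(values),
        "max": max(values),
    }


rows = load_rows(SAMPLE_CSV)
summary = {
    column: describe([row[column] for row in rows])
    for column in ("Quantity", "UnitPrice")
}

for column, stats in summary.items():
    print(column, stats)
```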
Step 2: Navigating to the Mine Section
Feature engineering takes place in the Mine section. Click on the 'Mine' tab in the toolbar.
Drag your data, customer_data in this tutorial, onto the canvas. You can now work with this data.
To perform feature engineering, connect your data to a Feature Flow node. Drag and drop the node onto the canvas, connect it to your data, and name it 'Arithmetic Feature'. Clear names matter for anyone who examines the project later.
Click the Arithmetic Feature node to open its own canvas. Now you are ready to perform arithmetic!
Step 3: Applying Arithmetic Operation and Creating New Feature
On the Arithmetic Feature canvas, choose the features you want to operate on and drag them onto the canvas. In our case, we choose Quantity and UnitPrice from the dataset and calculate TotalPurchase, which is the product of Quantity and UnitPrice.
Under the feature preprocess functions, you will find the Arithmetics function. Drag it onto the canvas and connect it to the selected features.
Click the Arithmetics node to open the sidebar for adjustments. Set the operation type to Multiplication and select the features as operands. We will name the new feature TotalPurchase.
When you are done, click save. Then run the Arithmetics node: right-click it and run it. Ta-da! You have a new feature.
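The multiplication configured in this step is equivalent to the following Python sketch. The sample rows are invented and the function add_product_feature is a hypothetical helper, not part of the platform; only the names Quantity, UnitPrice and TotalPurchase come from the tutorial.

```python
# Made-up sample rows; only the column names come from the tutorial.
rows = [
    {"Quantity": 2, "UnitPrice": 9.50},
    {"Quantity": 5, "UnitPrice": 3.00},
    {"Quantity": 1, "UnitPrice": 20.00},
    {"Quantity": 4, "UnitPrice": 2.50},
]


def add_product_feature(rows, left, right, new_name):
    """Return new rows with new_name = row[left] * row[right].

    The input rows are left unchanged.
    """
    return [{**row, new_name: row[left] * row[right]} for row in rows]


with_total = add_product_feature(rows, "Quantity", "UnitPrice", "TotalPurchase")

for row in with_total:
    print(row)
```

Returning new rows instead of modifying the input mirrors the node-based flow, where the source data stays intact and the new feature appears downstream.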
Step 4: Reviewing and Saving Your Work
Save the flow so that you do not lose any progress. Go back to the main flow and run the Arithmetic Feature node to see your new feature, TotalPurchase. You can review it under the Overview tab.
You can validate a few rows manually by multiplying 'Quantity' by 'UnitPrice' to confirm the results are as expected.
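If you have the exported rows available in Python, the same manual check can be automated. The sketch below assumes rows shaped like the tutorial's data, with invented values, and uses a hypothetical helper find_mismatches. It compares with a tolerance because floating-point products may differ in the last digits.

```python
import math


def find_mismatches(rows, rel_tol=1e-9):
    """Return indices of rows where TotalPurchase != Quantity * UnitPrice."""
    mismatched = []
    for index, row in enumerate(rows):
        expected = row["Quantity"] * row["UnitPrice"]
        if not math.isclose(row["TotalPurchase"], expected, rel_tol=rel_tol):
            mismatched.append(index)
    return mismatched


# Made-up rows that follow the tutorial's column names.
good_rows = [
    {"Quantity": 2, "UnitPrice": 9.50, "TotalPurchase": 19.0},
    {"Quantity": 5, "UnitPrice": 3.00, "TotalPurchase": 15.0},
]

# A copy with one deliberately wrong value, to show what a failure looks like.
bad_rows = [dict(row) for row in good_rows]
bad_rows[1]["TotalPurchase"] = 16.0

print(find_mismatches(good_rows))
print(find_mismatches(bad_rows))
```

An empty list means every checked row matches; any index in the list points to a row worth inspecting.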
Once satisfied, save your new dataset by using an 'Export' node. Drag it onto the canvas, connect it to Arithmetic Feature, and name the new dataset. Save it and run the whole flow at once. Your file now appears under generated files, and you can also examine it from the Import section.
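Outside the platform, the export step corresponds to writing the enriched rows back to a CSV file. This is a minimal sketch with invented rows. The file name customer_data_with_total.csv and the helper export_rows are assumptions for illustration, and the file is written to the system temporary directory.

```python
import csv
import os
import tempfile

# Made-up rows that already contain the new TotalPurchase feature.
rows = [
    {"Quantity": 2, "UnitPrice": 9.50, "TotalPurchase": 19.0},
    {"Quantity": 5, "UnitPrice": 3.00, "TotalPurchase": 15.0},
]


def export_rows(rows, path):
    """Write rows to a CSV file, using the first row's keys as the header."""
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical output file name, placed in the temporary directory.
out_path = os.path.join(tempfile.gettempdir(), "customer_data_with_total.csv")
export_rows(rows, out_path)

# Read the file back to confirm that the new column was written.
with open(out_path, newline="", encoding="utf-8") as handle:
    round_trip = list(csv.DictReader(handle))

print(round_trip)
```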
Congratulations! You've successfully performed arithmetic feature engineering on your dataset.