Boosting
Boosting is the process of automatically cleaning, filling, and enriching your contact data. This ensures your information is accurate, complete, and much more useful. The process works through three core operations:
- Enriching: Adds entirely new information that was missing. For example, creating and populating a `Company Sector` attribute based on a contact's email.
- Filling: Completes missing values in existing attributes. For example, if a contact has a `Name` but no `Surname`, boosting can fill in the `Surname` from their email address.
- Cleaning: Corrects or standardizes existing information. For example, transforming a `Name` from `dott. john smith` to `John Smith`.
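To make the cleaning and filling operations concrete, here is a minimal Python sketch. It is not Datumo's implementation: the `TITLES` set and both helper functions are hypothetical, and the rules are deliberately simplistic (for instance, `str.capitalize` would mangle a name like "McDonald").

```python
import re
from typing import Optional

# Hypothetical list of honorifics to strip; the real booster's rules are not documented here.
TITLES = {"dott", "dr", "sig", "mr", "mrs", "ms"}


def clean_name(raw: str) -> str:
    """Cleaning: drop leading titles, collapse whitespace, capitalize each token."""
    tokens = raw.strip().split()
    # Remove leading tokens that look like titles, e.g. "dott."
    while tokens and tokens[0].rstrip(".").lower() in TITLES:
        tokens.pop(0)
    return " ".join(token.capitalize() for token in tokens)


def fill_surname_from_email(name: str, email: str) -> Optional[str]:
    """Filling: guess a surname from a 'name.surname@domain' address, else None."""
    local_part = email.split("@", 1)[0]
    parts = re.split(r"[._-]", local_part)
    # Only trust the guess when the first part matches the known name.
    if len(parts) == 2 and parts[0].lower() == name.lower():
        return parts[1].capitalize()
    return None


print(clean_name("dott. john smith"))
print(fill_surname_from_email("John", "john.smith@example.com"))
```

The filling helper returns `None` rather than guessing when the email does not follow the expected pattern, which mirrors the idea of filling a value only when there is enough evidence for it.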
When you invoke boosting, a series of specialized boosters run by default to improve your data.
How Boosting Works
By default, Datumo runs a suite of boosters on your contacts. Each booster is designed to enhance specific attributes by intelligently analyzing the data you already have. The default boosters are:
- `name_surname`
- `human`
- `language`
- `business_language`
- `gender` (Gender Booster)
- `company` (Company Booster)
- `fiscal_code` (Fiscal Code Booster)
- `location` (Location Booster)

| Booster Name (id) | Attributes Affected | How it Works |
|---|---|---|
| Name & Surname (`name_surname`) | Name, Surname | Cleans typos and formatting, fills missing name parts (e.g., from email), and standardizes capitalization and diacritics. |
| Is Human (`human`) | Is human | Uses email domains, name patterns, job titles, and other signals to determine whether the record is a person or a company. |
| Preferred Language (`language`) | Preferred Language | Infers language from names, email domains, and textual clues; fills and standardizes the attribute. |
| Business Language (`business_language`) | Business Language | Infers the business communication language from company domain, industry, and corporate cues. |
| Gender (`gender`) | Gender | Extracts gender from the Italian Fiscal Code, cross-references name datasets, or uses an ML model; cleans and fills when confident. |
| Company (`company`) | Company - Name, Company - Website, Company - Sectors | Infers company data from the email domain; validates the website; identifies sectors; fills and standardizes attributes. |
| Fiscal Code (`fiscal_code`) | Fiscal Code | Validates format and rules; cleans and standardizes to uppercase if valid but misformatted. |
| Location (`location`) | Birth Place (City, Province, Country), Current Place (City, Province, Country, Currency) | Extracts birthplace from the Fiscal Code; infers current location from phone prefix and other signals; fills currency when the country is known. |
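As an illustration of the Fiscal Code and Gender rows, the sketch below normalizes an Italian fiscal code to uppercase and reads the gender from the day-of-birth digits (in the standard encoding, 40 is added to the day for women). This is a simplified example rather than Datumo's code: the function names are hypothetical, the pattern ignores "omocodia" variants, and the check character is not verified.

```python
import re
from typing import Optional

# Layout for a person: 6 letters, 2-digit year, month letter, 2-digit day,
# place code (1 letter + 3 digits), 1 check letter.
# Simplification: omocodia variants and the check character are not handled.
FISCAL_CODE_RE = re.compile(r"[A-Z]{6}[0-9]{2}[ABCDEHLMPRST][0-9]{2}[A-Z][0-9]{3}[A-Z]")


def normalize_fiscal_code(raw: str) -> Optional[str]:
    """Strip whitespace and uppercase; return None if the layout is invalid."""
    code = re.sub(r"\s+", "", raw).upper()
    return code if FISCAL_CODE_RE.fullmatch(code) else None


def gender_from_fiscal_code(code: str) -> str:
    """Read the day of birth (characters 10-11); values above 40 indicate a woman."""
    day = int(code[9:11])
    return "female" if day > 40 else "male"


# Illustrative codes only; their check characters have not been validated.
for raw in ["rssmra85t10a562s", " RSSMRA85T50A562X ", "not-a-code"]:
    code = normalize_fiscal_code(raw)
    print(raw.strip(), "->", code, gender_from_fiscal_code(code) if code else None)
```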
Boosting with a Subset of Boosters
You can execute a specific subset of boosters by providing a list of their names in the `boosters` argument. This is useful for targeted enrichments and faster processing times.
The available boosters are:
- `name_surname`
- `human`
- `language`
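Before sending a request, it can help to check the requested names against the list above on the client side. The following is a small sketch under that assumption; `select_boosters` is a hypothetical helper, and Datumo may well perform its own validation.

```python
import json
from typing import Iterable, List

# Booster names listed in this section.
AVAILABLE_BOOSTERS = ("name_surname", "human", "language")


def select_boosters(
    requested: Iterable[str],
    available: Iterable[str] = AVAILABLE_BOOSTERS,
) -> List[str]:
    """Return the requested names, deduplicated in order; reject unknown names."""
    allowed = set(available)
    selected: List[str] = []
    for name in requested:
        if name not in allowed:
            raise ValueError(f"unknown booster: {name!r}")
        if name not in selected:
            selected.append(name)
    return selected


boosters = select_boosters(["name_surname", "human", "name_surname"])
# The list is what would be passed as the `boosters` argument.
print(json.dumps({"boosters": boosters}))
```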
Requesting a boosting
To request a boosting, you need a collection containing contact data on Datumo. You can create a collection and upload your data by following the instructions in the Collection section.
Once you have uploaded your contact data, request a boosting by sending a POST request to the invocation endpoint, with `boosting` as the `invocationType`.
See more at Invoke Datumo.
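As a rough sketch of what such a request could look like, the Python snippet below builds (but does not send) a POST request with the standard library. Only `invocationType`, the value `boosting`, and the `boosters` argument come from this page; the URL is a placeholder, and the assumption that both fields travel in a JSON body, as well as the omitted authentication, should be checked against Invoke Datumo.

```python
import json
import urllib.request
from typing import Optional, Sequence


def build_boosting_request(
    endpoint_url: str,
    boosters: Optional[Sequence[str]] = None,
) -> urllib.request.Request:
    """Build a POST request asking for a boosting (hypothetical body layout)."""
    body = {"invocationType": "boosting"}
    if boosters:
        # Assumption: the subset of boosters is sent alongside invocationType.
        body["boosters"] = list(boosters)
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: replace it with the real invocation endpoint.
request = build_boosting_request("https://datumo.example/invoke", ["name_surname", "human"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here because the real endpoint and credentials are not part of this page.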
Boosting also produces reports and insights on the boosted data as additional results; for more information, see Boosting Secondary Results.
Interpreting the boosting results
The results of a boosting request contain your cleaned contact data, with missing values filled and new information added.
Example
You request a boosting for the following contacts:
ID,Name,Surname,Gender,Email,Company - Name
0,Silvia,Marri,female,silvi.marri@snrt.co.eu,SN RTek
1,,,female,toninal@nicojd.com,NicoJds
The output, in CSV format, will be:
ID,Name,Surname,Gender,Email,Company - Name,Is human,Company - Site
0,Silvi,Marri,female,silvi.marri@snrt.co.eu,SN RTek,True,www.snrt.co.eu
1,,Tonina,female,toninal@nicojd.com,NicoJds,True,www.nicojd.com
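One way to inspect results like these is to compare the input and output files cell by cell. The sketch below does that for the example above using Python's `csv` module; the labels "enriched", "filled", and "cleaned" are a heuristic mapping onto the three operations described earlier, not something the output file itself reports.

```python
import csv
import io

INPUT_CSV = """ID,Name,Surname,Gender,Email,Company - Name
0,Silvia,Marri,female,silvi.marri@snrt.co.eu,SN RTek
1,,,female,toninal@nicojd.com,NicoJds
"""

OUTPUT_CSV = """ID,Name,Surname,Gender,Email,Company - Name,Is human,Company - Site
0,Silvi,Marri,female,silvi.marri@snrt.co.eu,SN RTek,True,www.snrt.co.eu
1,,Tonina,female,toninal@nicojd.com,NicoJds,True,www.nicojd.com
"""


def read_by_id(text):
    """Index CSV rows by their ID column."""
    return {row["ID"]: row for row in csv.DictReader(io.StringIO(text))}


def diff_boosting(before_text, after_text):
    """List (id, column, kind, new_value) for every cell that boosting changed."""
    before, after = read_by_id(before_text), read_by_id(after_text)
    changes = []
    for contact_id, new_row in after.items():
        old_row = before.get(contact_id, {})
        for column, new_value in new_row.items():
            old_value = old_row.get(column)
            if old_value is None:
                if not new_value:
                    continue
                kind = "enriched"  # column did not exist in the input
            elif old_value == "" and new_value != "":
                kind = "filled"  # existing column, previously empty
            elif old_value != new_value:
                kind = "cleaned"  # existing value was rewritten
            else:
                continue
            changes.append((contact_id, column, kind, new_value))
    return changes


for change in diff_boosting(INPUT_CSV, OUTPUT_CSV):
    print(change)
```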
Deprecated Boosters
The following columns come from deprecated boosters:
- Country
- Income by Degree Of Urbanisation
- Income by Birth Country
- Income by Age and Gender
- Birthday
- Minimum family size
- Generation
- Age
- Phone Number
- Income by Household Type
- Income by Educational Level
- Maximum family size

These columns are kept temporarily, only to ensure backward compatibility with existing data, and will be removed in a future version of Datumo.