Boosting

Boosting is the process of automatically cleaning, filling, and enriching your contact data. This ensures your information is accurate, complete, and much more useful. The process works through three core operations:

  • ✨ Enriching: Adds entirely new information that was missing. For example, creating and populating a Company Sector attribute based on a contact's email.
  • ✍️ Filling: Completes missing values in existing attributes. For example, if a contact has a Name but no Surname, boosting can fill in the Surname from their email address.
  • 🧼 Cleaning: Corrects or standardizes existing information. For example, transforming a Name from dott. john smith to John Smith.
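
To make the three operations concrete, here is a toy Python sketch. It is not Datumo's implementation: the honorific list, the `name.surname@domain` heuristic, and the function names are all assumptions made for illustration only.

```python
# Hypothetical honorifics to strip; Datumo's real cleaning rules are not documented here.
HONORIFICS = ("dott.", "dr.", "sig.", "mr.", "mrs.")


def clean_name(raw):
    """Cleaning: drop a leading honorific and normalize capitalization."""
    words = raw.strip().split()
    if words and words[0].lower() in HONORIFICS:
        words = words[1:]
    return " ".join(word.capitalize() for word in words)


def fill_surname(contact):
    """Filling: if Surname is empty, guess it from a 'name.surname@domain' email."""
    filled = dict(contact)
    if not filled.get("Surname"):
        local_part = filled.get("Email", "").split("@")[0]
        parts = local_part.split(".")
        if len(parts) == 2 and parts[1]:
            filled["Surname"] = parts[1].capitalize()
    return filled


def enrich_website(contact):
    """Enriching: add a brand-new attribute derived from the email domain."""
    enriched = dict(contact)
    email = enriched.get("Email", "")
    if "@" in email and "Company - Website" not in enriched:
        enriched["Company - Website"] = "www." + email.split("@", 1)[1]
    return enriched
```

The difference between the operations is visible in what each function touches: cleaning rewrites an existing value, filling writes only into an empty existing attribute, and enriching adds an attribute the contact did not have.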

When you invoke boosting, a series of specialized boosters run by default to improve your data.

How Boosting Works

By default, Datumo runs a suite of boosters on your contacts. Each booster is designed to enhance specific attributes by intelligently analyzing the data you already have. The default boosters are:

  • name_surname
  • human
  • language
  • business_language
  • gender (Gender Booster)
  • company (Company Booster)
  • fiscal_code (Fiscal Code Booster)
  • location (Location Booster)

Booster Name (id) | Attributes Affected | How it Works
Name & Surname (name_surname) | Name, Surname | Cleans typos/formatting, fills missing name parts (e.g., from email), and standardizes capitalization/diacritics.
Is Human (human) | Is human | Uses email domains, name patterns, job titles, and other signals to determine if the record is a person or a company.
Preferred Language (language) | Preferred Language | Infers language from names, email domains, and textual clues; fills and standardizes the attribute.
Business Language (business_language) | Business Language | Infers business communication language using company domain, industry, and corporate cues.
Gender (gender) | Gender | Extracts from Italian Fiscal Code, cross-references with name datasets, or uses ML model; cleans and fills when confident.
Company (company) | Company - Name, Company - Website, Company - Sectors | Infers company data from email domain; validates website; identifies sectors; fills and standardizes attributes.
Fiscal Code (fiscal_code) | Fiscal Code | Validates format/rules; cleans and standardizes to uppercase if valid but misformatted.
Location (location) | Birth Place (City, Province, Country), Current Place (City, Province, Country, Currency) | Extracts birthplace from Fiscal Code; infers current location from phone prefix/other signals; fills currency when country is known.

Boosting with a Subset of Boosters

You can choose to execute a specific subset of boosters by providing a list of their names in the boosters argument. This is useful for targeted enrichments and faster processing times. 🚀

The available boosters are:

  • name_surname
  • human
  • language
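
As a small sketch of how a client might prepare the boosters argument, the helper below validates a requested subset against the three names listed above before it is sent. The helper name `select_boosters` and the choice to reject unknown names on the client side are assumptions, not part of the Datumo API.

```python
# Booster names taken from the list above.
AVAILABLE_BOOSTERS = ("name_surname", "human", "language")


def select_boosters(requested):
    """Return the requested booster names, de-duplicated and in order.

    Raises ValueError if any name is not in AVAILABLE_BOOSTERS.
    This client-side check is hypothetical; the service may validate differently.
    """
    unknown = [name for name in requested if name not in AVAILABLE_BOOSTERS]
    if unknown:
        raise ValueError(f"Unknown boosters: {', '.join(unknown)}")
    # dict.fromkeys keeps the first occurrence of each name, preserving order.
    return list(dict.fromkeys(requested))
```

For example, `select_boosters(["human", "name_surname", "human"])` returns `["human", "name_surname"]`, which can then be passed as the boosters argument.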

Requesting a boosting

To request a boosting, you need a collection with contact data on Datumo. You can create a collection and upload your data by following the instructions in the Collection section.

Once you have uploaded your contact data, you can request a boosting by sending a POST request to the invocation endpoint, with boosting as invocationType. See more at Invoke Datumo.
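
The sketch below shows what building such a request could look like with Python's standard library. Only the POST method, the invocationType value, and the boosters argument come from this page. The URL path `/invoke`, the `collectionId` field, and the bearer-token header are placeholders invented for illustration; check Invoke Datumo for the actual endpoint and payload.

```python
import json
import urllib.request


def build_boosting_request(base_url, collection_id, api_key, boosters=None):
    """Build (but do not send) a POST request asking for a boosting."""
    # "invocationType": "boosting" is documented; "collectionId" is a hypothetical field name.
    payload = {"invocationType": "boosting", "collectionId": collection_id}
    if boosters is not None:
        # Optional subset of boosters; omit to run the default suite.
        payload["boosters"] = list(boosters)
    return urllib.request.Request(
        url=f"{base_url}/invoke",  # hypothetical path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # hypothetical auth scheme
        },
        method="POST",
    )
```

The returned object can be sent with `urllib.request.urlopen(request)` once the placeholders are replaced with real values.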

Boosting also produces reports and insights on the boosted data as additional results; for more information, see Boosting Secondary Results.

Interpreting the boosting results

The result of a boosting request is your contact data with cleaned values, filled attributes, and newly enriched information.

Example

You request a boosting for the following contacts:

ID,Name,Surname,Gender,Email,Company - Name
0,Silvia,Marri,female,silvi.marri@snrt.co.eu,SN RTek
1,,,female,toninal@nicojd.com,NicoJds

The output, in CSV format, will be:

ID,Name,Surname,Gender,Email,Company - Name,Is human,Company - Site
0,Silvi,Marri,female,silvi.marri@snrt.co.eu,SN RTek,True,www.snrt.co.eu
1,,Tonina,female,toninal@nicojd.com,NicoJds,True,www.nicojd.com
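
To see which operation touched each value, you can compare the input and output files. The following sketch uses the example data above and labels every change as cleaned, filled, or enriched. The function name and the classification rules are assumptions chosen to mirror the definitions at the top of this page; they are not a Datumo feature.

```python
import csv
import io

INPUT_CSV = """ID,Name,Surname,Gender,Email,Company - Name
0,Silvia,Marri,female,silvi.marri@snrt.co.eu,SN RTek
1,,,female,toninal@nicojd.com,NicoJds"""

OUTPUT_CSV = """ID,Name,Surname,Gender,Email,Company - Name,Is human,Company - Site
0,Silvi,Marri,female,silvi.marri@snrt.co.eu,SN RTek,True,www.snrt.co.eu
1,,Tonina,female,toninal@nicojd.com,NicoJds,True,www.nicojd.com"""


def classify_changes(input_csv, output_csv, key="ID"):
    """Map each contact ID to {column: 'cleaned' | 'filled' | 'enriched'}."""
    before = {row[key]: row for row in csv.DictReader(io.StringIO(input_csv))}
    changes = {}
    for row in csv.DictReader(io.StringIO(output_csv)):
        original = before.get(row[key], {})
        for column, value in row.items():
            if column == key or not value:
                continue  # skip the key column and values that are still empty
            if column not in original:
                kind = "enriched"  # attribute did not exist in the input
            elif not original[column]:
                kind = "filled"  # attribute existed but was empty
            elif original[column] != value:
                kind = "cleaned"  # existing value was modified
            else:
                continue  # unchanged
            changes.setdefault(row[key], {})[column] = kind
    return changes
```

Calling `classify_changes(INPUT_CSV, OUTPUT_CSV)` on the example reports Name as cleaned for contact 0, Surname as filled for contact 1, and Is human and Company - Site as enriched for both.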

Deprecated Boosters

The following columns come from boosters that are deprecated and will be removed in future versions:

  • Country
  • Income by Degree Of Urbanisation
  • Income by Birth Country
  • Income by Age and Gender
  • Birthday
  • Minimum family size
  • Generation
  • Age
  • Phone Number
  • Income by Household Type
  • Income by Educational Level
  • Maximum family size

These columns are kept temporarily, only to ensure backward compatibility with existing data. They will be removed in future versions of Datumo.
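
If you want downstream code to stop depending on these columns before they disappear, one option is to drop them when reading boosting results. This is a minimal sketch that assumes results are loaded as a list of dictionaries, for example via `csv.DictReader`; the helper name is hypothetical.

```python
# Column names copied from the deprecated list above.
DEPRECATED_COLUMNS = frozenset({
    "Country",
    "Income by Degree Of Urbanisation",
    "Income by Birth Country",
    "Income by Age and Gender",
    "Birthday",
    "Minimum family size",
    "Generation",
    "Age",
    "Phone Number",
    "Income by Household Type",
    "Income by Educational Level",
    "Maximum family size",
})


def drop_deprecated(rows):
    """Return copies of the rows without any deprecated columns."""
    return [
        {column: value for column, value in row.items()
         if column not in DEPRECATED_COLUMNS}
        for row in rows
    ]
```

The input rows are left unmodified, so the original data stays available if a deprecated column is still needed during the transition.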