Data Schemas
Lead Scoring relies on historical deal data. This section provides the data schema that defines the required columns. Any extra columns in the CSV file are ignored.
Deals
Each row in the CSV file represents a single deal. Fields whose name begins with company contain information about the company linked to the deal. Fields whose name begins with contact contain information about the person who is the main contact for the deal.
| Column name | Description | Format details | Nullable | Example |
|---|---|---|---|---|
| id | Unique identifier for the deal. | String | False | 1 |
| name | Name of the deal. | String | True | Deal with Company X |
| description | Description of the deal. | String | True | Partnership agreement |
| status | Indicates the status of the deal. Should be one of: closed_won, closed_lost, open. | String | False | closed_won |
| stage | Current stage of the deal. | String | False | Negotiation |
| currentStageStartDatetime | Datetime of entrance in the current deal stage. | yyyy-mm-ddThh:MM:ss | False | 2024-06-10T09:15:30 |
| value | Value associated with the deal. | Positive Decimal | False | 1000.00 |
| lastUpdateDatetime | Datetime of the deal's last recorded change. | yyyy-mm-ddThh:MM:ss | False | 2024-06-18T12:30:45 |
| numberOfSentEmails | Number of emails sent to the contact regarding the deal. | Positive Integer | True | 5 |
| numberOfEmailOpens | Number of times deal-related emails were opened by the contact. | Positive Integer | True | 20 |
| numberOfReceivedEmails | Number of emails received from the contact regarding the deal. | Positive Integer | True | 5 |
| numberOfMeetings | Number of meetings related to the deal. | Positive Integer | True | 10 |
| lastInteractionDatetime | Datetime of the last interaction (email, meeting, etc.) regarding the deal. | yyyy-mm-ddThh:MM:ss | True | 2024-06-18T12:30:45 |
| creationDatetime | Datetime when the deal was initiated (or recorded in the system). | yyyy-mm-ddThh:MM:ss | False | 2024-05-01T08:00:00 |
| accountId | Unique identifier for the account associated with the deal. | String | False | x |
| accountAverageResponseTime | The average response time of the account, in hours rounded to the nearest integer. | Positive Integer | True | 5 |
| companyId | Unique identifier for the associated company. | String | False | 123 |
| companyNumberOfEmployees | Number of employees in the company. | Positive Integer | True | 50 |
| companyRevenue | Revenue of the company. | Positive Decimal | True | 1000060.50 |
| companyIndustry | Industry in which the company operates. | String | True | Technology |
| companyType | Indicates the type of the company. Should be one of: public (if the company is publicly traded), private (if not publicly traded), public_administration. | String | True | private |
| companyDescription | Description of the company. | String | True | Leading AI solutions... |
| companyTechnologies | Semicolon-separated list of technologies used by the company. | String | True | AI;Machine Learning;web |
| companyCity | City where the company is located. | String | True | Oslo |
| companyCountry | Country where the company is located. | String | True | Norway |
| companyIsCustomer | True if the company linked to the deal is a customer (has at least one closed-won deal). | Boolean | False | true |
| contactId | Unique identifier for the main contact person. | String | False | 456 |
| contactNumberOfConversions | Number of previous deals won with the contact. | Positive Integer | False | 5 |
| contactEmailDomainType | Indicates the type of the contact's email domain. Should be one of: free_provider (if the domain is provided by a free email hosting service), company (if not). | String | True | company |
| contactEmailOptedOut | Indicates whether the contact has opted out of receiving communications via email. | Boolean | True | true |
| contactAverageResponseTime | The average response time of the contact, in hours rounded to the nearest integer. | Positive Integer | True | 5 |
| lastSyncedAt | Datetime when the dataset was exported from the source CRM. | yyyy-mm-ddThh:MM:ss | False | 2024-06-18T12:30:45 |
The lastSyncedAt and lastUpdateDatetime fields are crucial for the Lead Scoring model to work correctly. The lastSyncedAt field should be updated every time the data is updated in the system, and the lastUpdateDatetime field every time a property of the deal is updated. Together they describe how the data evolves over time, so keeping them up to date matters both for correct data collection and for model performance.
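The constraints in the table above can be checked before uploading a file. The following is a minimal sketch, not part of Lead Scoring itself: the function name check_row is hypothetical, rows are assumed to be dictionaries of strings (as produced by Python's csv.DictReader), and an empty string is assumed to represent a null value.

```python
from datetime import datetime

# Columns marked Nullable = False in the table above.
NON_NULLABLE = (
    "id", "status", "stage", "currentStageStartDatetime", "value",
    "lastUpdateDatetime", "creationDatetime", "accountId", "companyId",
    "companyIsCustomer", "contactId", "contactNumberOfConversions",
    "lastSyncedAt",
)
DATETIME_COLUMNS = (
    "currentStageStartDatetime", "lastUpdateDatetime",
    "lastInteractionDatetime", "creationDatetime", "lastSyncedAt",
)
ALLOWED_STATUS = {"closed_won", "closed_lost", "open"}
# strptime equivalent of the yyyy-mm-ddThh:MM:ss pattern used in the table.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def check_row(row):
    """Return a list of problems found in one deal row (hypothetical helper)."""
    problems = []
    for column in NON_NULLABLE:
        # Assumption: a missing key or an empty string means "null".
        if not row.get(column):
            problems.append(f"{column}: missing value")
    status = row.get("status")
    if status and status not in ALLOWED_STATUS:
        problems.append(f"status: unexpected value {status!r}")
    for column in DATETIME_COLUMNS:
        raw = row.get(column)
        if not raw:
            continue  # nullable datetimes may be empty; required ones are reported above
        try:
            datetime.strptime(raw, DATETIME_FORMAT)
        except ValueError:
            problems.append(f"{column}: not in yyyy-mm-ddThh:MM:ss format")
    return problems


good_row = {
    "id": "1", "status": "closed_won", "stage": "Negotiation",
    "currentStageStartDatetime": "2024-06-10T09:15:30", "value": "1000.00",
    "lastUpdateDatetime": "2024-06-18T12:30:45",
    "creationDatetime": "2024-05-01T08:00:00", "accountId": "x",
    "companyId": "123", "companyIsCustomer": "true", "contactId": "456",
    "contactNumberOfConversions": "5", "lastSyncedAt": "2024-06-18T12:30:45",
}
bad_row = dict(good_row, status="won", lastSyncedAt="18/06/2024")
good_problems = check_row(good_row)
bad_problems = check_row(bad_row)
```

Here good_problems is empty, while bad_problems reports the unexpected status and the malformed lastSyncedAt value.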
All string fields should follow these rules:
- all text should be enclosed in double quote characters (");
- text can contain new lines;
- if the text contains a " character, escape it by doubling it. Example: the raw text Client said: "it is a great idea!" becomes Client said: ""it is a great idea!"".
The following values are accepted for boolean fields:
- true, True, TRUE, or 1 for true;
- false, False, FALSE, or 0 for false.
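As an illustration of the accepted boolean values, the sketch below maps a raw CSV value to a Python boolean. The function name parse_boolean and the nullable flag are hypothetical, and treating an empty string as null is an assumption rather than documented behavior.

```python
# Accepted spellings, taken from the list above.
TRUE_VALUES = {"true", "True", "TRUE", "1"}
FALSE_VALUES = {"false", "False", "FALSE", "0"}


def parse_boolean(raw, nullable=False):
    """Convert a raw CSV value to True, False, or None (hypothetical helper)."""
    if raw == "" and nullable:
        # Assumption: an empty value stands for null in nullable columns.
        return None
    if raw in TRUE_VALUES:
        return True
    if raw in FALSE_VALUES:
        return False
    raise ValueError(f"not an accepted boolean value: {raw!r}")
```

For example, parse_boolean("TRUE") gives True and parse_boolean("0") gives False, while a value such as "yes" raises a ValueError.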
Adding notes
Deal data can be enriched with notes taken about the deal. Lead Scoring uses all notes, as long as they are contained in columns with a name like note*, where * is any number. All note columns are nullable.
Note columns are of string type and should follow the same rules as the other string fields.
The presence of notes significantly increases Lead Scoring model performance: it is highly recommended to include all available notes in the CSV file.
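To illustrate the naming pattern, the sketch below picks the note columns out of a CSV header. It assumes that "any number" means one or more digits directly after the word note; the function name note_columns is hypothetical.

```python
import re

# Assumption: a note column is "note" followed by one or more digits.
NOTE_COLUMN = re.compile(r"note\d+")


def note_columns(header):
    """Return the header names that match the note* pattern, in order."""
    return [name for name in header if NOTE_COLUMN.fullmatch(name)]


header = ["id", "name", "note1", "notes", "note12", "noteA", "lastSyncedAt"]
selected = note_columns(header)  # ["note1", "note12"]
```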
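Notes often contain quotes and line breaks, so the string rules matter most here. The following sketch writes note values with Python's standard csv module, which encloses string values in double quotes when quoting=csv.QUOTE_NONNUMERIC is set and doubles embedded quotes when doublequote=True. The function name write_deals and the reduced set of columns are for illustration only.

```python
import csv
import io


def write_deals(header, rows):
    """Serialize rows to CSV text, quoting every string value (hypothetical helper)."""
    buffer = io.StringIO()
    # The header is written as plain names, as in the documented example file.
    buffer.write(",".join(header) + "\n")
    writer = csv.writer(
        buffer,
        quoting=csv.QUOTE_NONNUMERIC,  # quote strings, leave numbers bare
        doublequote=True,              # escape " by doubling it
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buffer.getvalue()


header = ["id", "numberOfMeetings", "note1", "note2"]
rows = [
    ["1", 10, 'Client said: "it is a great idea!"', "first line\nsecond line"],
]
csv_text = write_deals(header, rows)
```

Numeric values are passed as Python numbers so that they stay unquoted, while the embedded quotes and the line break in the notes survive a round trip through a standard CSV reader.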
CSV file example
id,name,description,status,stage,currentStageStartDatetime,value,lastUpdateDatetime,numberOfSentEmails,numberOfEmailOpens,numberOfReceivedEmails,numberOfMeetings,lastInteractionDatetime,creationDatetime,accountId,accountAverageResponseTime,companyId,companyNumberOfEmployees,companyRevenue,companyIndustry,companyType,companyDescription,companyTechnologies,companyCity,companyCountry,companyIsCustomer,contactId,contactNumberOfConversions,contactEmailDomainType,contactEmailOptedOut,contactAverageResponseTime,note1,lastSyncedAt
1,"Deal with Company X",Partnership agreement,closed_won,Negotiation,2024-06-10T09:15:30,1000.00,2024-06-18T12:30:45,5,20,5,10,2024-06-18T12:30:45,2024-05-01T08:00:00,x,5,123,50,1000060.50,Technology,private,"Leading AI solutions...",AI;Machine Learning;web,Oslo,Norway,true,456,5,company,true,5,,2024-06-18T12:30:45
2,"Deal with Company Y",Consulting agreement,closed_lost,Proposal,2024-03-15T14:20:00,5000.00,2024-04-22T09:45:15,10,30,8,5,2024-04-22T09:45:15,2024-02-01T11:30:00,y,3,456,100,5000000.00,Finance,public,"Global financial services...",finance;banking;compliance,London,United Kingdom,false,789,2,company,false,2,"X said: ""seems a great idea!""",2024-06-18T12:30:45
3,"Deal with Company Z",Reseller partnership,open,Negotiation,2024-07-01T08:00:00,50000.00,2024-07-15T16:00:00,3,15,2,20,2024-07-15T16:00:00,2024-06-01T09:00:00,z,7,789,25,500000.00,Retail,private,"Online retail platform...",e-commerce;web;mobile,Berlin,Germany,false,321,1,company,true,4,"Y said: ""seems a great idea!""",2024-06-18T12:30:45
4,"Deal with Company A",OEM agreement,closed_won,Closed,2024-05-20T11:30:00,25000.00,2024-06-05T14:00:00,8,25,6,15,2024-06-05T14:00:00,2024-04-01T10:00:00,a,4,159,75,2500000.00,Manufacturing,public,"Industrial automation...",manufacturing;robotics;IoT,Tokyo,Japan,true,654,3,company,false,3,"seems interested",2024-06-18T12:30:45
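A file in this shape can be read back with any standard CSV parser. As a minimal sketch using Python's csv module, the snippet below parses a shortened, made-up sample (only a few of the columns, plus one extra column) and splits the semicolon-separated technologies; the function name read_deals is hypothetical.

```python
import csv
import io

# Shortened sample for illustration; a real file carries all required columns.
sample = '''id,companyTechnologies,note1,extraColumn
2,finance;banking;compliance,"X said: ""seems a great idea!""",ignored
'''


def read_deals(text):
    """Parse CSV text into dictionaries, splitting companyTechnologies into a list."""
    deals = []
    for row in csv.DictReader(io.StringIO(text)):
        technologies = row.get("companyTechnologies") or ""
        row["companyTechnologies"] = [t for t in technologies.split(";") if t]
        deals.append(row)
    return deals


deals = read_deals(sample)
```

After parsing, the doubled quotes in note1 are restored to single quote characters, and companyTechnologies holds a list of three technology names.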