Data Schemas

Lead Scoring relies on historical Deals data. This section provides the data schema that defines the required columns. Any other columns in the CSV file are ignored, except for the optional note columns described under Adding notes.

Deals

Each row in the csv file is related to a single deal. Fields whose name begins with company contain information relating to the company linked to the deal. Fields whose name begins with contact contain information about the person who is the main contact for the deal.

| Column name | Description | Format details | Nullable | Example |
| --- | --- | --- | --- | --- |
| id | Unique identifier for the deal. | String | False | 1 |
| name | Name of the deal. | String | True | Deal with Company X |
| description | Description of the deal. | String | True | Partnership agreement |
| status | Indicates the status of the deal. Should be one of: closed_won, closed_lost, open. | String | False | closed_won |
| stage | Current stage of the deal. | String | False | Negotiation |
| currentStageStartDatetime | Datetime at which the deal entered its current stage. | yyyy-mm-ddThh:MM:ss | False | 2024-06-10T09:15:30 |
| value | Value associated with the deal. | Positive Decimal | False | 1000.00 |
| lastUpdateDatetime | Datetime of the deal's last recorded change. | yyyy-mm-ddThh:MM:ss | False | 2024-06-18T12:30:45 |
| numberOfSentEmails | Number of emails sent to the contact regarding the deal. | Positive Integer | True | 5 |
| numberOfEmailOpens | Number of times deal-related emails were opened by the contact. | Positive Integer | True | 20 |
| numberOfReceivedEmails | Number of emails received from the contact regarding the deal. | Positive Integer | True | 5 |
| numberOfMeetings | Number of meetings related to the deal. | Positive Integer | True | 10 |
| lastInteractionDatetime | Datetime of the last interaction (email, meeting, etc.) regarding the deal. | yyyy-mm-ddThh:MM:ss | True | 2024-06-18T12:30:45 |
| creationDatetime | Datetime at which this specific deal was initiated (or recorded in the system). | yyyy-mm-ddThh:MM:ss | False | 2024-05-01T08:00:00 |
| accountId | Unique identifier for the account associated with the deal. | String | False | x |
| accountAverageResponseTime | The average response time of the account, in hours rounded to the nearest integer. | Positive Integer | True | 5 |
| companyId | Unique identifier for the associated company. | String | False | 123 |
| companyNumberOfEmployees | Number of employees in the company. | Positive Integer | True | 50 |
| companyRevenue | Revenue of the company. | Positive Decimal | True | 1000060.50 |
| companyIndustry | Industry in which the company operates. | String | True | Technology |
| companyType | Indicates the type of the company. Should be one of: public (if the company is publicly traded), private (if not publicly traded), public_administration. | String | True | private |
| companyDescription | Description of the company. | String | True | Leading AI solutions... |
| companyTechnologies | Semicolon-separated list of technologies used by the company. | String | True | AI;Machine Learning;web |
| companyCity | City where the company is located. | String | True | Oslo |
| companyCountry | Country where the company is located. | String | True | Norway |
| companyIsCustomer | True if the company linked to the deal is a customer (has at least one closed-won deal). | Boolean | False | true |
| contactId | Unique identifier for the main contact person. | String | False | 456 |
| contactNumberOfConversions | Number of previous deals won with the contact. | Positive Integer | False | 5 |
| contactEmailDomainType | Indicates the type of the contact's email domain. Should be one of: free_provider (if the domain is provided by a free email hosting service), company (if not). | String | True | company |
| contactEmailOptedOut | Indicates whether the contact has opted out of receiving communications via email. | Boolean | True | true |
| contactAverageResponseTime | The average response time of the contact, in hours rounded to the nearest integer. | Positive Integer | True | 5 |
| lastSyncedAt | Datetime at which the dataset was exported from the source CRM. | yyyy-mm-ddThh:MM:ss | False | 2024-06-18T12:30:45 |
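
The schema can be checked before upload with a small script. The following is a minimal sketch, not part of Lead Scoring itself: the function name validate_row and the constants are hypothetical, the list of required columns is taken from the rows marked Nullable = False, and the format yyyy-mm-ddThh:MM:ss is assumed to correspond to the Python pattern %Y-%m-%dT%H:%M:%S.

```python
from datetime import datetime

# Columns marked Nullable = False in the schema.
REQUIRED_COLUMNS = [
    "id", "status", "stage", "currentStageStartDatetime", "value",
    "lastUpdateDatetime", "creationDatetime", "accountId", "companyId",
    "companyIsCustomer", "contactId", "contactNumberOfConversions",
    "lastSyncedAt",
]
DATETIME_COLUMNS = [
    "currentStageStartDatetime", "lastUpdateDatetime",
    "lastInteractionDatetime", "creationDatetime", "lastSyncedAt",
]
ALLOWED_STATUS = {"closed_won", "closed_lost", "open"}
# Assumed Python equivalent of yyyy-mm-ddThh:MM:ss.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def validate_row(row):
    """Return a list of human-readable problems found in one deal row.

    `row` is a dict mapping column names to raw string values, for
    example one item produced by csv.DictReader.
    """
    problems = []
    for column in REQUIRED_COLUMNS:
        if not (row.get(column) or "").strip():
            problems.append(f"{column}: missing value")
    status = row.get("status")
    if status and status not in ALLOWED_STATUS:
        problems.append(f"status: unexpected value {status!r}")
    for column in DATETIME_COLUMNS:
        raw = row.get(column)
        if raw:  # empty nullable datetimes are skipped
            try:
                datetime.strptime(raw, DATETIME_FORMAT)
            except ValueError:
                problems.append(f"{column}: bad datetime {raw!r}")
    return problems
```

An empty list means the row passed these checks. The sketch does not cover numeric or boolean columns.
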

lastSyncedAt importance

The lastSyncedAt and lastUpdateDatetime fields are crucial for the Lead Scoring model to work correctly. The lastSyncedAt field should be updated every time the data is updated in the system, and the lastUpdateDatetime field every time a property of the deal is updated. Together, these fields capture how the data evolves over time. Keeping them up to date is important both for the data collection to work correctly and for the model to perform well.
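
One way to keep lastSyncedAt consistent is to stamp every row with the export time when the CSV is generated. This is only a sketch under assumptions: stamp_last_synced is a hypothetical helper, rows are assumed to be dictionaries keyed by column name, and the datetime pattern is the assumed Python equivalent of yyyy-mm-ddThh:MM:ss.

```python
from datetime import datetime

# Assumed Python equivalent of yyyy-mm-ddThh:MM:ss.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def stamp_last_synced(rows, exported_at):
    """Return copies of `rows` with lastSyncedAt set to the export time.

    The input rows are left unmodified; every returned row carries the
    same timestamp, since they all belong to the same export.
    """
    stamp = exported_at.strftime(DATETIME_FORMAT)
    return [{**row, "lastSyncedAt": stamp} for row in rows]


rows = [{"id": "1"}, {"id": "2"}]
stamped = stamp_last_synced(rows, datetime(2024, 6, 18, 12, 30, 45))
print(stamped[0]["lastSyncedAt"])  # prints 2024-06-18T12:30:45
```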

String format

All string fields should follow the following rules:

  • all text should be enclosed in double-quote characters (");
  • text can contain new lines;
  • any " character inside the text must be escaped by doubling it. Example: the raw text Client said: "it is a great idea!" becomes "Client said: ""it is a great idea!""".
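
These quoting rules match what Python's standard csv module produces with its default double-quote escaping. The sketch below is illustrative only; to_csv_line is a hypothetical helper, and it assumes that numeric values are passed as Python numbers so that only text is quoted.

```python
import csv
import io


def to_csv_line(values):
    """Serialize one row, quoting every non-numeric value.

    csv.QUOTE_NONNUMERIC encloses strings in double quotes and the
    writer doubles any embedded double quote (doublequote=True is the
    default). Embedded new lines are kept inside the quoted field.
    """
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n"
    )
    writer.writerow(values)
    return buffer.getvalue()


line = to_csv_line([1, 'Client said: "it is a great idea!"'])
print(line, end="")
# prints: 1,"Client said: ""it is a great idea!"""
```
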

Boolean format

The following values are accepted for boolean fields:

  • true or True or TRUE or 1 for true;
  • false or False or FALSE or 0 for false.
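
When preparing or checking an export, the accepted spellings can be mapped to booleans as follows. This is a minimal sketch: parse_boolean is a hypothetical helper, and treating an empty value as null (for nullable columns such as contactEmailOptedOut) is an assumption, not something the schema states.

```python
TRUE_VALUES = {"true", "True", "TRUE", "1"}
FALSE_VALUES = {"false", "False", "FALSE", "0"}


def parse_boolean(raw):
    """Convert a raw CSV value to True, False, or None.

    Only the spellings listed in the documentation are accepted.
    An empty string is treated as a null value (assumption).
    """
    if raw == "":
        return None
    if raw in TRUE_VALUES:
        return True
    if raw in FALSE_VALUES:
        return False
    raise ValueError(f"not an accepted boolean value: {raw!r}")
```

With this sketch, a value such as yes or tRuE raises ValueError instead of being silently coerced.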

Adding notes

Deals data can be enriched with notes taken about the deal. Lead Scoring uses all notes, as long as they are stored in columns named note*, where * is any number (for example, note1). All note columns are nullable. Note columns are of string type and must follow the same rules as the other string fields.
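
To check which columns of a file would count as notes, the naming rule can be expressed as a regular expression. This is a sketch under one assumption: the pattern requires at least one digit after note, so a column named just note or notes is not matched. The helper name note_columns is hypothetical.

```python
import re

# "note" followed by one or more digits (assumed reading of note*).
NOTE_COLUMN = re.compile(r"note\d+")


def note_columns(header):
    """Return the column names in `header` that follow the note* rule."""
    return [name for name in header if NOTE_COLUMN.fullmatch(name)]


print(note_columns(["id", "note1", "notes", "note12", "lastSyncedAt"]))
# prints: ['note1', 'note12']
```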

The presence of notes significantly improves Lead Scoring model performance, so it is highly recommended to include all available notes in the CSV file.

CSV file example

id,name,description,status,stage,currentStageStartDatetime,value,lastUpdateDatetime,numberOfSentEmails,numberOfEmailOpens,numberOfReceivedEmails,numberOfMeetings,lastInteractionDatetime,creationDatetime,accountId,accountAverageResponseTime,companyId,companyNumberOfEmployees,companyRevenue,companyIndustry,companyType,companyDescription,companyTechnologies,companyCity,companyCountry,companyIsCustomer,contactId,contactNumberOfConversions,contactEmailDomainType,contactEmailOptedOut,contactAverageResponseTime,note1,lastSyncedAt
1,"Deal with Company X",Partnership agreement,closed_won,Negotiation,2024-06-10T09:15:30,1000.00,2024-06-18T12:30:45,5,20,5,10,2024-06-18T12:30:45,2024-05-01T08:00:00,x,5,123,50,1000060.50,Technology,private,"Leading AI solutions...",AI;Machine Learning;web,Oslo,Norway,true,456,5,company,true,5,,2024-06-18T12:30:45
2,"Deal with Company Y",Consulting agreement,closed_lost,Proposal,2024-03-15T14:20:00,5000.00,2024-04-22T09:45:15,10,30,8,5,2024-04-22T09:45:15,2024-02-01T11:30:00,y,3,456,100,5000000.00,Finance,public,"Global financial services...",finance;banking;compliance,London,United Kingdom,false,789,2,company,false,2,"X said: ""seems a great idea!""",2024-06-18T12:30:45
3,"Deal with Company Z",Reseller partnership,open,Negotiation,2024-07-01T08:00:00,50000.00,2024-07-15T16:00:00,3,15,2,20,2024-07-15T16:00:00,2024-06-01T09:00:00,z,7,789,25,500000.00,Retail,private,"Online retail platform...",e-commerce;web;mobile,Berlin,Germany,false,321,1,company,true,4,"Y said: ""seems a great idea!""",2024-06-18T12:30:45
4,"Deal with Company A",OEM agreement,closed_won,Closed,2024-05-20T11:30:00,25000.00,2024-06-05T14:00:00,8,25,6,15,2024-06-05T14:00:00,2024-04-01T10:00:00,a,4,159,75,2500000.00,Manufacturing,public,"Industrial automation...",manufacturing;robotics;IoT,Tokyo,Japan,true,654,3,company,false,3,"seems interested",2024-06-18T12:30:45