Data Schemas
Sales Forecasting leverages historical data to forecast the revenue from won deals. In this section, we provide the schema for the data required.
Sales Forecasting requires data of type deals, which contains information about the deals in your CRM system. It predicts monthly revenue from the historical data provided in the deals, with a horizon of 12 months. If you have more than one pipeline in your CRM and want separate predictions for each pipeline, create a separate collection for each pipeline.
Deals data
The following table describes the schema for the deals data. Each row in the CSV file corresponds to a single deal.
The data must be uploaded as a single CSV file that contains all the columns listed below. You can add extra columns, but they will be ignored.
| Column name | Description | Format details | Nullable | Example |
|---|---|---|---|---|
| id | Unique identifier for the deal. | String | False | 1 |
| name | Name of the deal. | String | True | Customer X - annual supply |
| description | Description of the deal. | String | True | Partnership agreement |
| creationDatetime | Refers to the datetime when this specific deal was initiated. | yyyy-mm-ddThh:MM:ss | False | 2024-05-01T08:00:00 |
| addDatetime | Refers to the datetime when this specific deal was added to the CRM (first recorded). It should be >= creationDatetime. | yyyy-mm-ddThh:MM:ss | False | 2024-05-01T08:00:00 |
| closedDatetime | Refers to the datetime when this specific deal was closed (won or lost). Can be null only if the deal status is open. | yyyy-mm-ddThh:MM:ss | True | 2024-06-10T09:15:30 |
| expectedClosureDatetime | Refers to the estimated closure datetime for the deal. | yyyy-mm-ddThh:MM:ss | True | 2024-06-10T09:15:30 |
| status | Indicates the status of the deal. Should be one of: closed_won, closed_lost, open. | String | False | closed_won |
| stage | Current stage of the deal. | String | False | Negotiation |
| currentStageStartDatetime | Datetime of entrance in the current deal stage. | yyyy-mm-ddThh:MM:ss | False | 2024-06-10T09:15:30 |
| value | Value associated with the deal. Can be null only while the deal is not won. | Positive Decimal | True | 1000.00 |
| numberOfSentEmails | Number of emails sent to the contact regarding the deal. | Positive Integer | True | 5 |
| numberOfEmailOpens | Number of times deal-related emails were opened by the contact. | Positive Integer | True | 20 |
| numberOfReceivedEmails | Number of emails received from the contact regarding the deal. | Positive Integer | True | 5 |
| numberOfMeetings | Number of meetings related to the deal. | Positive Integer | True | 10 |
| lastInteractionDatetime | Datetime of the last interaction (email, meeting, etc.) regarding the deal. | yyyy-mm-ddThh:MM:ss | True | 2024-06-18T12:30:45 |
| accountId | Unique identifier for the account associated with the deal. | String | False | x |
| accountAverageResponseTime | The average response time of the account, in hours rounded to the nearest integer. | Positive Integer | True | 5 |
| companyId | Unique identifier for the associated company. | String | False | 123 |
| companyNumberOfEmployees | Number of employees in the company. | Positive Integer | True | 50 |
| companyRevenue | Revenue of the company. | Positive Decimal | True | 1000060.50 |
| companyIndustry | Industry in which the company operates. | String | True | Technology |
| companyType | Indicates the type of the company. Should be one of: public (if the company is publicly traded), private (if not publicly traded), public_administration. | String | True | private |
| companyDescription | Description of the company. | String | True | Leading AI solutions... |
| companyTechnologies | Semicolon-separated list of technologies used by the company. | String | True | AI;Machine Learning;web |
| companyCity | City where the company is located. | String | True | Oslo |
| companyCountry | Country where the company is located. | String | True | Norway |
| companyIsCustomer | True if the company associated with the deal is a customer (has at least one closed-won deal). | Boolean | False | true |
| contactId | Unique identifier for the main contact person. | String | False | 456 |
| contactNumberOfConversions | Number of previous deals won with the contact. | Positive Integer | False | 5 |
| contactEmailDomainType | Indicates the type of the contact's email domain. Should be one of: free_provider (if the domain is provided by a free email hosting service), company (if not). | String | True | company |
| contactEmailOptedOut | Indicates whether the contact has opted out of receiving communications via email. | Boolean | True | true |
| contactAverageResponseTime | The average response time of the contact, in hours rounded to the nearest integer. | Positive Integer | True | 5 |
| lastUpdateDatetime | Datetime of the deal's last recorded change. | yyyy-mm-ddThh:MM:ss | False | 2024-06-18T12:30:45 |
| lastSyncedAt | Datetime when the dataset was exported from the source CRM. | yyyy-mm-ddThh:MM:ss | False | 2024-06-18T12:30:45 |
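
Some of the constraints in the table can be checked before upload. The following is a minimal sketch, not part of Sales Forecasting itself: the function names `parse_datetime` and `validate_deal` are hypothetical, and it assumes each deal has been read into a dict of strings in which an empty string stands for null. It covers only the status values, the datetime format of three fields, the addDatetime >= creationDatetime rule, and the rule that closedDatetime may be null only for open deals.

```python
from datetime import datetime

# strptime pattern corresponding to yyyy-mm-ddThh:MM:ss in the table.
# Note: strptime also accepts non-zero-padded values such as 2024-5-1T8:00:00.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"
ALLOWED_STATUSES = {"closed_won", "closed_lost", "open"}


def parse_datetime(text):
    """Return a datetime, or None when the cell is empty (assumed to mean null)."""
    return datetime.strptime(text, DATETIME_FORMAT) if text else None


def validate_deal(row):
    """Return a list of problems found in one deal row (hypothetical helper)."""
    problems = []
    if row.get("status") not in ALLOWED_STATUSES:
        problems.append("status must be closed_won, closed_lost or open")
    try:
        created = parse_datetime(row.get("creationDatetime", ""))
        added = parse_datetime(row.get("addDatetime", ""))
        closed = parse_datetime(row.get("closedDatetime", ""))
    except ValueError:
        problems.append("datetime not in yyyy-mm-ddThh:MM:ss format")
        return problems
    if created is None or added is None:
        problems.append("creationDatetime and addDatetime are required")
    elif added < created:
        problems.append("addDatetime must be >= creationDatetime")
    if closed is None and row.get("status") != "open":
        problems.append("closedDatetime may be empty only for open deals")
    return problems


# Usage: a closed deal without a closedDatetime is reported.
deal = {
    "status": "closed_won",
    "creationDatetime": "2024-05-01T08:00:00",
    "addDatetime": "2024-05-01T08:00:00",
    "closedDatetime": "",
}
print(validate_deal(deal))
```

The remaining columns (numeric ranges, booleans, the other datetime fields) could be checked in the same style.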
Although expectedClosureDatetime is nullable, it is recommended to provide this information: some Sales Forecasting models can only work if this field is filled in for each deal.
If you choose not to provide this information, Sales Forecasting will still work, but forecast accuracy may be compromised because some models cannot be used.
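To see how much of your data carries this field, you can measure its coverage before upload. This is a rough sketch under the same assumption that rows are dicts of strings with an empty string standing for null; `expected_closure_coverage` is a hypothetical name, and the documentation does not state a threshold other than "each deal".

```python
def expected_closure_coverage(rows):
    """Return the fraction of deals with a non-empty expectedClosureDatetime."""
    rows = list(rows)
    if not rows:
        return 0.0
    filled = sum(
        1 for row in rows if (row.get("expectedClosureDatetime") or "").strip()
    )
    return filled / len(rows)


# Usage: one of the two deals below has the field filled in.
deals = [
    {"id": "1", "expectedClosureDatetime": "2024-06-10T09:15:30"},
    {"id": "2", "expectedClosureDatetime": ""},
]
print(expected_closure_coverage(deals))  # 0.5
```

A result below 1.0 means at least one deal is missing the field, so the models that depend on it may not be available.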
Guidelines
To ensure the best results, follow these guidelines when preparing your data:
- Provide at least 24 months of historical data: Sales Forecasting predicts monthly revenue with a horizon of 12 months. For the best results, provide at least 24 months of historical data so that Sales Forecasting can learn from it.
- Keep the data up to date: to ensure accurate predictions, provide the latest information about your deals. Ideally, the data should be updated daily. We suggest integrating Sales Forecasting with your CRM system to update the data automatically.
- Pipeline stages: ensure that the pipeline stages are consistent across all deals. Avoid using multiple identifiers or names for the same stage, as this can lead to inaccurate predictions.
- Deal value: provide the value associated with each deal. This is a crucial input for accurate revenue predictions. The field is nullable since some deals may not have a value associated with them in the first stages of the pipeline, but make sure to fill in this information as soon as it becomes available. If the deal is won, the value should be the final value of the deal and it becomes required.
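The guidelines above can be turned into simple pre-upload checks. The sketch below is illustrative only: the three function names are hypothetical, rows are assumed to be dicts of strings with an empty string standing for null, history length is measured on creationDatetime (an assumption, since the guideline does not name a field), and stage names are treated as duplicates only when they differ by case or surrounding whitespace.

```python
from collections import defaultdict
from datetime import datetime

DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def history_in_months(rows):
    """Calendar-month difference between the earliest and latest
    creationDatetime, ignoring the day of month. Needs at least one row."""
    dates = [datetime.strptime(r["creationDatetime"], DATETIME_FORMAT) for r in rows]
    first, last = min(dates), max(dates)
    return (last.year - first.year) * 12 + (last.month - first.month)


def stage_name_variants(rows):
    """Group stage names that differ only by case or surrounding whitespace."""
    groups = defaultdict(set)
    for r in rows:
        groups[r["stage"].strip().lower()].add(r["stage"])
    return {key: names for key, names in groups.items() if len(names) > 1}


def won_deals_without_value(rows):
    """Return the ids of closed_won deals whose value cell is empty."""
    return [
        r["id"]
        for r in rows
        if r["status"] == "closed_won" and not r["value"].strip()
    ]


# Usage with two hand-written rows.
deals = [
    {"id": "1", "creationDatetime": "2022-01-15T00:00:00",
     "stage": "Negotiation", "status": "closed_won", "value": ""},
    {"id": "2", "creationDatetime": "2024-03-01T00:00:00",
     "stage": "negotiation ", "status": "open", "value": "10.0"},
]
print(history_in_months(deals))        # 26
print(stage_name_variants(deals))
print(won_deals_without_value(deals))  # ['1']
```

Stage names that are spelled differently altogether (for example an abbreviation) would not be caught by this normalization and still need a manual review.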
CSV file example
id,name,description,creationDatetime,addDatetime,closedDatetime,expectedClosureDatetime,status,stage,currentStageStartDatetime,value,numberOfSentEmails,numberOfEmailOpens,numberOfReceivedEmails,numberOfMeetings,lastInteractionDatetime,accountId,accountAverageResponseTime,companyId,companyNumberOfEmployees,companyRevenue,companyIndustry,companyType,companyDescription,companyTechnologies,companyCity,companyCountry,companyIsCustomer,contactId,contactNumberOfConversions,contactEmailDomainType,contactEmailOptedOut,contactAverageResponseTime,lastUpdateDatetime,lastSyncedAt
1,Customer A - annual supply,Partnership agreement,2024-05-01T08:00:00,2024-05-01T08:00:00,2024-06-10T09:15:30,2024-06-10T09:15:30,closed_won,Negotiation,2024-06-10T09:15:30,1000.00,5,20,5,10,2024-06-18T12:30:45,x,5,123,50,1000060.50,Technology,private,Leading AI solutions...,AI;Machine Learning;Web,Oslo,Norway,true,456,5,company,true,5,2024-06-18T12:30:45,2024-06-18T12:30:45
2,Customer B - software license,Subscription renewal,2024-04-15T10:30:00,2024-04-15T10:30:00,2024-05-20T14:45:00,2024-05-25T00:00:00,closed_lost,Proposal,2024-05-10T12:15:00,5000.00,10,35,8,6,2024-05-18T16:00:30,y,8,234,200,5000000.00,Finance,public,Financial solutions provider,Cloud;Big Data,New York,USA,true,789,3,company,false,4,2024-05-20T14:45:00,2024-06-18T12:30:45
3,Customer C - consulting services,Strategic partnership,2024-03-20T09:00:00,2024-03-21T08:00:00,,2024-07-01T12:00:00,open,Discovery,2024-06-01T10:00:00,25000.00,2,10,3,2,2024-06-15T15:45:20,z,6,345,1000,25000000.00,Consulting,private,Top consulting firm,Analytics;Cloud,London,UK,false,567,7,free_provider,true,7,2024-06-15T15:45:20,2024-06-18T12:30:45
4,Customer D - IT support,Long-term contract,2024-02-10T14:45:00,2024-02-11T09:30:00,2024-03-30T18:20:00,2024-04-01T00:00:00,closed_won,Final Review,2024-03-20T16:45:00,15000.00,7,25,4,5,2024-03-30T18:20:00,w,4,678,500,100000000.00,IT Services,private,Global IT solutions,DevOps;Security;Cloud,Berlin,Germany,true,890,12,company,false,6,2024-03-30T18:20:00,2024-06-18T12:30:45
5,Customer E - hardware procurement,Procurement agreement,2024-01-05T11:20:00,2024-01-06T14:00:00,,2024-08-15T10:00:00,open,Qualification,2024-06-10T09:00:00,7500.00,3,15,6,3,2024-06-12T11:30:00,v,9,910,120,7500000.00,Manufacturing,public,Leader in hardware manufacturing,Robotics;Automation;IoT,Tokyo,Japan,false,321,2,company,true,8,2024-06-12T11:30:00,2024-06-18T12:30:45
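Before uploading a file like the one above, it can help to confirm that the header contains every column from the schema table. The following sketch uses only Python's standard `csv` module; `REQUIRED_COLUMNS` is copied from the table, while the function name `missing_columns` is hypothetical and the check is not part of Sales Forecasting itself.

```python
import csv
import io

# Column names taken from the deals schema table.
REQUIRED_COLUMNS = [
    "id", "name", "description", "creationDatetime", "addDatetime",
    "closedDatetime", "expectedClosureDatetime", "status", "stage",
    "currentStageStartDatetime", "value", "numberOfSentEmails",
    "numberOfEmailOpens", "numberOfReceivedEmails", "numberOfMeetings",
    "lastInteractionDatetime", "accountId", "accountAverageResponseTime",
    "companyId", "companyNumberOfEmployees", "companyRevenue",
    "companyIndustry", "companyType", "companyDescription",
    "companyTechnologies", "companyCity", "companyCountry",
    "companyIsCustomer", "contactId", "contactNumberOfConversions",
    "contactEmailDomainType", "contactEmailOptedOut",
    "contactAverageResponseTime", "lastUpdateDatetime", "lastSyncedAt",
]


def missing_columns(csv_text):
    """Return the required columns that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty input yields an empty header
    present = {name.strip() for name in header}
    return [name for name in REQUIRED_COLUMNS if name not in present]


# Usage: extra columns are allowed, so only absent required ones are listed.
sample = "id,name,myExtraColumn\n1,Customer A - annual supply,foo\n"
print(missing_columns(sample)[:3])  # ['description', 'creationDatetime', 'addDatetime']
```

To check a file on disk, read it first (for example with `open(path, newline="", encoding="utf-8").read()`) and pass the text to `missing_columns`.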