Machine Learning to Predict Sales and ROI at Points of Sale

The Client

The client is one of the world’s most famous tobacco companies. The company distributes tobacco products across a set of geographically distributed Points of Sale (POS) and uses a set of marketing techniques to raise the visibility of its products and increase net sales volume.

The Business Challenge

The client needed a solution to apply modern data mining techniques to optimize the company’s marketing processes and increase the sales per dollar spent on product promotion and advertising on the local market.
All sales prediction analytics were performed manually by the client’s sales personnel, so the client needed to automate the process to get faster, more timely, and more comprehensive insights.


DataArt was chosen as a trusted technology consultant and development partner with 20 years of experience in building digital solutions for the retail industry. Thanks to the global distribution of our R&D offices, we were able to support the client in their local market with a cross-corporate team.

The aim of the project was to create a solution that would allow quick and timely access to statistical models based on the client’s datasets to solve the specified problems, namely predicting ROI and simulating marketing strategies to increase the net volume of sales.

Each of the client’s existing marketing tools carried its own expenses and had its own effect on the sales process.

Prior to development, we worked with the client’s sales and marketing teams to identify and document the BI requirements, and then began iterative development. Working iteratively enabled the client to use the application from the very beginning of the engineering process.

For modelling, we introduced regression analysis based on year-to-year comparisons.
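As a rough illustration of what a year-to-year regression can look like, the sketch below relates the year-over-year change in marketing spend to the year-over-year change in sales for a handful of points of sale. The record layout, the field names, and the single-predictor model are assumptions made for this example, and the numbers are invented; the case study does not describe the actual model.

```python
from dataclasses import dataclass


@dataclass
class PosYear:
    """One POS observed in two consecutive years (hypothetical schema)."""
    pos_id: str
    sales_prev: float   # net sales volume in the earlier year
    sales_curr: float   # net sales volume in the later year
    spend_prev: float   # marketing spend in the earlier year
    spend_curr: float   # marketing spend in the later year


def yoy_change(prev: float, curr: float) -> float:
    """Relative year-over-year change, e.g. 0.10 for +10%."""
    return (curr - prev) / prev


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented illustrative rows, not client data.
rows = [
    PosYear("pos-a", 100.0, 104.0, 10.0, 10.0),
    PosYear("pos-b", 200.0, 216.0, 20.0, 24.0),
    PosYear("pos-c", 150.0, 168.0, 10.0, 14.0),
    PosYear("pos-d", 120.0, 120.0, 10.0, 9.0),
]

spend_growth = [yoy_change(r.spend_prev, r.spend_curr) for r in rows]
sales_growth = [yoy_change(r.sales_prev, r.sales_curr) for r in rows]

slope, intercept = fit_line(spend_growth, sales_growth)
print(f"sales growth ~ {intercept:.3f} + {slope:.3f} * spend growth")
```

Comparing each POS with itself a year earlier removes much of the seasonal effect, which is one common reason to regress on year-over-year changes rather than raw volumes.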

The application processes a variety of datasets, including socio-demographic and geographic variables (e.g., POS location). It also considers seasonality, purchase behavior data, consumer solvency, and local market macroeconomic factors.

We used cloud-based infrastructure for data processing. The application is built on Apache Spark, which makes the solution highly scalable and able to work with big data. It also makes it possible to add further data sources such as web logs, customer feedback from social media, marketing texts, and more.

The solution is based on proven open source components and data mining libraries that provide a set of ready-to-use machine learning algorithms as well as the means for model testing and verification.

The application makes it possible to analyze the efficiency of the marketing tools applied, replay different scenarios, and select an optimal marketing strategy. Scenarios can be configured with timeframes for a particular POS or a group of POSs from the same or different geographical regions.
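The scenario comparison described above can be sketched as follows. This is a minimal illustration, not the client’s resource allocation tool: the `Scenario` fields, the ROI definition (incremental margin minus spend, divided by spend), the budget constraint, and all figures are assumptions introduced for the example. In practice the predicted uplift would come from a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """A candidate marketing plan for a POS group and timeframe (hypothetical)."""
    name: str
    spend: float             # total marketing spend for the timeframe
    predicted_uplift: float  # predicted incremental sales volume


def roi(scenario: Scenario, margin_per_unit: float) -> float:
    """Return on investment: (incremental margin - spend) / spend."""
    incremental_margin = scenario.predicted_uplift * margin_per_unit
    return (incremental_margin - scenario.spend) / scenario.spend


def best_scenario(scenarios, margin_per_unit: float, budget: float):
    """Pick the affordable scenario with the highest ROI, or None."""
    affordable = [s for s in scenarios if s.spend <= budget]
    if not affordable:
        return None
    return max(affordable, key=lambda s: roi(s, margin_per_unit))


# Invented illustrative scenarios, not client data.
scenarios = [
    Scenario("shelf display", spend=1000.0, predicted_uplift=600.0),
    Scenario("price promotion", spend=2500.0, predicted_uplift=1800.0),
    Scenario("combined", spend=4000.0, predicted_uplift=2400.0),
]

choice = best_scenario(scenarios, margin_per_unit=2.0, budget=3000.0)
print(choice.name if choice else "no scenario fits the budget")
```

Keeping the ROI calculation separate from the selection step means the same scenarios can be replayed under a different budget or margin assumption without touching the model output.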

The solution supports dataset preparation procedures: data gathering, format unification, data cleanup and normalization, and feature engineering. It enables research into suitable Machine Learning (ML) algorithms and models for predicting uplifts and ROI using a data mining toolchain (WEKA, scikit-learn, R). It also supports research into ML algorithms suited to the regression problems that arise when simulating marketing scenarios with the resource allocation tool.
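To make the preparation steps more concrete, here is a small sketch of cleanup, normalization, and feature engineering on a few records. The field names (`pos_id`, `month`, `sales`), the min-max scaling, and the cyclical month encoding are assumptions chosen for illustration; the case study does not specify which transformations were used, and the data is invented.

```python
import math


def clean(records):
    """Unify the sales field to float and drop records where it is unusable."""
    cleaned = []
    for rec in records:
        try:
            sales = float(rec["sales"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value
        if math.isnan(sales):
            continue
        cleaned.append({**rec, "sales": sales})
    return cleaned


def min_max(values):
    """Scale values to [0, 1]; a constant column maps to all zeros."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def add_features(records):
    """Add a normalized sales column and a simple seasonal feature."""
    scaled = min_max([r["sales"] for r in records])
    out = []
    for rec, s in zip(records, scaled):
        month = int(rec["month"])
        out.append({
            **rec,
            "sales_scaled": s,
            # Encode the month on a circle so December sits next to January.
            "month_sin": math.sin(2 * math.pi * month / 12),
            "month_cos": math.cos(2 * math.pi * month / 12),
        })
    return out


# Invented illustrative records, not client data.
raw = [
    {"pos_id": "pos-a", "month": 1, "sales": "120.5"},  # string, e.g. from CSV
    {"pos_id": "pos-b", "month": 7, "sales": 80},
    {"pos_id": "pos-c", "month": 12, "sales": None},    # missing value
    {"pos_id": "pos-d", "month": 12, "sales": 100.0},
]

prepared = add_features(clean(raw))
for row in prepared:
    print(row["pos_id"], round(row["sales_scaled"], 3))
```

At larger data volumes the same steps would more likely be expressed with a data frame library or a distributed engine; the plain-Python version is only meant to show the order of operations.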

In general, the key advantages of the solution are:

  • Open source stack with proven, efficient data mining tools and libraries.
  • Works with modern distributed computation environments and fits big-data-scale requirements.
  • Cloud-ready: can be easily deployed on AWS, Azure, or GCP.


The application developed by DataArt helps the client enhance their marketing and sales strategies in the local market by providing timely insights into how much to invest in sales promotions and other marketing campaigns, and when. It helps the client achieve their sales goals.

The application provides sales and marketing managers with quick and timely access to all the necessary business information and analytics for better and more accurate:

  • Pricing model analysis and adjustment
  • Sales forecast and planning
  • Pricing impact on margin forecasting
  • Revenue growth management
  • Customer segmentation
  • Campaign targeting
  • Analysis of promotion impact
  • Customer loyalty monitoring and management
  • Channel management for each point of sale

As a result, the application enhances the sales strategy at each of the client’s points of sale, improves the client’s operating margins, helps increase earnings, and grows their market share.






