Meeting the Challenge
The overall transformation was hampered by a lack of documentation, resources, and people who could answer questions. In effect, the entire solution upgrade became a reverse engineering exercise: DataArt’s team had to decipher the source code, the implementation and configuration details, and the deployment model without any prior technical insight.
The following approach was used to resolve the issues:
Data processing, storage, and all services were migrated to the AWS (Amazon Web Services) environment. For data processing, the solution was deployed into a DC/OS cluster built on AWS resources. All deployable components are now delivered as Docker containers, which makes it easier to hand them over to the newly created GuestMetrics Ops team and simplifies the deployment process.
Instead of maintaining a Hive cluster-based solution built from a mix of Java, Bash, SQL, HDFS, and Hive, the entire solution was converted into a portable data processing pipeline based on Apache Spark. This made it possible to eliminate dependencies and, more importantly, to run the data pipeline every six hours instead of once a month. As a result, the data pipeline met the client’s platform requirements.
The legacy platform was transformed using R and a Shiny server for visualizing data. Because migrating the Dashboard system would have been complex and time-consuming, the development team decided to keep the platform as a subcomponent of a more scalable and easily extendable solution. This made it possible to enable SSO integration and customization based on individual client needs.
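To give a sense of what the change in pipeline cadence means in practice, the six-hour schedule mentioned above can be sketched with a few lines of Python. This is only an illustration under stated assumptions: the function name `scheduled_runs`, the 30-day window, and the start date are hypothetical and do not come from the actual solution, which ran on Apache Spark in a DC/OS cluster.

```python
from datetime import datetime, timedelta

# Cadence stated in the case study: one pipeline run every six hours.
RUN_INTERVAL = timedelta(hours=6)


def scheduled_runs(start, end, interval=RUN_INTERVAL):
    """Return every run start time in the half-open window [start, end).

    Hypothetical helper for illustration; not part of the real pipeline.
    """
    runs = []
    current = start
    while current < end:
        runs.append(current)
        current += interval
    return runs


# Assumed 30-day window standing in for "one month".
window_start = datetime(2024, 1, 1)
window_end = window_start + timedelta(days=30)

runs = scheduled_runs(window_start, window_end)
print(len(runs))  # 120 runs, compared with a single monthly run
```

Under these assumptions, the data a customer sees is at most six hours old, whereas a monthly run can leave it up to a month out of date.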
Business Benefits
The DataArt team transformed the pipeline into a new, fast, cost-effective, and optimized technological solution capable of quickly processing transactions. The solution was migrated to AWS, thus eliminating the hardware dependency and minimizing maintenance costs. Additional scalability was achieved through Docker containerization.
The earlier, slow data processing system prevented the Client from developing their business and from providing their customers with data promptly.
The new and more effective data processing pipeline helped the Client achieve their business goals and successfully meet customer requirements.
Technology
The high-level component architecture of the data pipeline:

DataArt performed in-depth research and provided comprehensive documentation for each step and component of the transformed solution.
