Client
The client is a leading data management provider for media creators and businesses.
Business Challenge
Managing music rights and royalties presents significant challenges due to fragmented, constantly evolving metadata. Identifying rights ownership is a complex process, often leading to incomplete records and revenue discrepancies.
To address these issues, the client turned to DataArt to develop a centralized platform for collecting, improving, and sharing music rights data among various industry stakeholders. The solution needed to serve multiple roles, including publishers, recording registrants, distributors, aggregators, record labels, digital service providers, rights holders, and music industry societies.
Meeting the Challenge
DataArt developed a cloud-native, AWS-powered solution that streamlines music metadata synchronization, licensing, and royalty processing. The solution ensures controlled data sharing based on peer-to-peer agreements, enabling secure, efficient, and transparent collaboration across the ecosystem.
The platform automates metadata integration, ensuring a seamless data flow across all industry participants.

The system operates on an event-driven architecture, automatically triggering processes in response to external events, such as file uploads to designated Amazon S3 buckets via SFTP clients. Once a file is uploaded, AWS Lambda functions initiate processing, executing workflows through AWS Batch or Step Functions.
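As an illustration of this trigger path, the following is a minimal sketch of a Lambda handler that reacts to an S3 upload notification and starts a Step Functions execution. It is not the client's actual code: the state machine ARN, the shape of the execution input, and the function names are assumptions made for the example. The `sfn_client` parameter is also an addition, included so the handler can be exercised without AWS credentials.

```python
import json
from urllib.parse import unquote_plus

# Hypothetical ARN for illustration; the case study does not name any state machine.
STATE_MACHINE_ARN = (
    "arn:aws:states:us-east-1:123456789012:stateMachine:ingest-metadata"
)


def extract_uploads(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    uploads = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 notifications (spaces become '+').
        key = unquote_plus(s3["object"]["key"])
        uploads.append((s3["bucket"]["name"], key))
    return uploads


def handler(event, context, sfn_client=None):
    """Start one Step Functions execution per uploaded object."""
    if sfn_client is None:
        # Imported lazily so the parsing logic can run without boto3 installed.
        import boto3

        sfn_client = boto3.client("stepfunctions")

    started = []
    for bucket, key in extract_uploads(event):
        response = sfn_client.start_execution(
            stateMachineArn=STATE_MACHINE_ARN,
            # Assumed input shape: the workflow only needs to locate the file.
            input=json.dumps({"bucket": bucket, "key": key}),
        )
        started.append(response["executionArn"])
    return {"started": started}
```

In a deployment along these lines, the S3 bucket notification would be configured to invoke the function, and heavier jobs could be submitted to AWS Batch from the same handler or from a state in the workflow.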
Key Components of the Solution
The system's architecture is organized into several layers:
- ELT Layer: Leverages AWS Step Functions, Lambda, and Batch for efficient data processing and transformation.
- Internal Data Layer: Utilizes Snowflake, Amazon Aurora PostgreSQL, and Amazon S3 for optimized data storage and analytics.
- Service Layer: Runs microservices on Amazon ECS with AWS Fargate and exposes them through APIs. An Application Load Balancer distributes incoming traffic, while Amazon CloudFront speeds up content delivery.
- Client Layer: Provides a user-friendly interface built with JavaScript and React.
- Monitoring Layer: Uses Amazon CloudWatch for real-time system monitoring and logging.
- External Data Layer: Enables secure data transfers via SFTP and Amazon S3.
- Code and Infrastructure Management: Streamlines deployment with GitHub, GitHub Actions, Docker, and Terraform.
Outcomes
Leveraging AWS services, the platform enhances scalability, simplifies operations, and ensures real-time accuracy in rights data management.
Key achievements include:
- Automated metadata ingestion: Supports XLSX, CSV, DDEX, CWR, and BWARM file formats, integrating data into Snowflake.
- Advanced data modeling: Maintains confidentiality while enabling dataset validation, correction, and cross-stakeholder matching.
- Asset metadata processing: Supports distinct yet interlinked entities, allowing analysis of multiple versions across datasets.
- Optimized data flow: Automates the entire process from SFTP-based data ingestion to generating enriched output files with suggested matches.
- User-friendly UI: Allows users to process and manage data without direct interaction with AWS services.
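The case study does not describe how suggested matches are produced. Purely as an illustration of cross-stakeholder matching, the sketch below compares titles from two datasets after a simple normalization and proposes the closest candidate above a similarity threshold. The function names, the title-only comparison, and the 0.85 threshold are all assumptions; a production matcher would likely use more fields than the title.

```python
import re
from difflib import SequenceMatcher


def normalize(title):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def suggest_matches(left_titles, right_titles, threshold=0.85):
    """Pair each left title with its most similar right title, if close enough.

    The threshold is an assumed value for this example, not one from the source.
    """
    suggestions = []
    for left in left_titles:
        best_title, best_score = None, 0.0
        for right in right_titles:
            score = SequenceMatcher(
                None, normalize(left), normalize(right)
            ).ratio()
            if score > best_score:
                best_title, best_score = right, score
        if best_title is not None and best_score >= threshold:
            suggestions.append((left, best_title, round(best_score, 2)))
    return suggestions
```

Returning the score alongside each pair keeps the result a suggestion rather than a decision, which fits the idea of an output file that users review and correct.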
Business Benefits
- Increased Efficiency: Automated workflows minimize manual effort, streamlining metadata management and royalty processing.
- Improved Accuracy: The data model enables better validation and correction, keeping rights data precise and up to date.
- Scalability and Flexibility: Cloud-based infrastructure ensures the platform scales dynamically to accommodate growing data volumes and evolving industry needs.
- Enhanced Security: The solution maintains strict data segregation, protecting confidentiality while still allowing stakeholders to collaborate.
- Cost Optimization: A Total Cost of Ownership (TCO) analysis confirmed that the AWS-based infrastructure helped reduce the TCO by 12% while maintaining high performance.
Technology
Python, FastAPI, AWS Boto3, Pytest, SQLAlchemy, Snowflake Python Connector, Psycopg2, Selenium, BeautifulSoup, Pandas, NumPy
JavaScript, ReactJS, Redux, HTML5, CSS3, Vite
Amazon Aurora PostgreSQL, Snowflake, dbt, Amazon S3
AWS Step Functions, Lambda, Batch, Fargate, CloudTrail, Amazon CloudWatch
Terraform, GitHub, GitHub Actions
