You are opening our English language website. You can keep reading or switch to other languages.

Speeding Up DevOps & SRE Incidents Resolution with AWS Bedrock

Company Name

Location

Netherlands

Client and Challenge

Fiberplane addresses the needs of SRE and DevOps teams with high up-time demands, providing programmable notebooks, real-time collaboration, and API integrations to streamline incident reviews, boosting effectiveness and resilience. It also allows to create a structured knowledge base for future correlated incident resolutions. 

Fiberplane sought to enrich their platform by integrating an engine capable of generating queries to Prometheus (a monitoring and metrics collecting tool). This feature aims to assist DevOps and SREs in gaining a clearer understanding of the most suitable labels to utilize.

Solution

DataArt’s team developed a GenAI-powered assistant that leverages natural language requests to generate Prometheus queries (PromQL) based on historical and context-aware query examples. The generated suggestions aid DevOps and SRE users of Fiberplane who will be able to interact with the system in the natural language, receiving suggestions without having to refer to runbooks. The team created a diverse and representative dataset for query validation that allowed them to achieve these results after several rounds of testing. 

To support this system, AWS Bedrock was utilized to enhance the natural language processing capabilities, while Amazon SageMaker Notebook Instances facilitated the development and training of machine learning models. The application relies on Amazon RDS for PostgreSQL to securely store and manage query data, with AWS Secret Manager ensuring the protection of sensitive information. AWS Lambda (SAM) was used to handle serverless compute tasks, and Amazon S3 provided scalable storage for datasets and model artifacts.

The main difficulty was thorough data preparation for the queries validation process in the complex multicomponent system. In order to overcome this, sources of possible supported requests were limited to four applications. Multistep filtering and subsequent improvement of the text descriptions for the queries were also applied in order to enhance the results and achieve higher accuracy of the final suggestions.

Outcomes

  • Achieved 81% accuracy rate of the generated suggestions
  • Natural language requests are converted into PromQL queries in order to retrieve environmental details from specialized applications
  • Enhanced code completion accuracy
  • Improved operational efficiency and faster code development

Technologies

AWS Bedrock
Amazon RDS for PostgreSQL
Amazon SageMaker Notebook Instances
AWS Lambda (SAM)
AWS Secret Manager
Amazon S3
Contact Us
Please provide your contact details, and we will get back to you promptly.