Bedrock

Bedrock Rightsizing

This recommendation identifies whether the provisioned Model Units (MUs) can be adjusted based on actual usage trends. Reduce provisioned throughput where utilization is consistently low, and increase it where persistent throttling or performance lag is observed.

Key Decision Factors

  1. Throttling Detection: Any throttled requests trigger scale-up recommendations
  2. Utilization Threshold: Usage below 50% triggers scale-down recommendations
  3. Savings Threshold: Only recommendations with ≥5% potential savings are included
  4. Capacity Buffer: Scale-down recommendations include 10% spare capacity
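The four decision factors above can be combined into a single check. The following is a minimal sketch of one way to do that, not the actual recommendation engine. The function and field names are hypothetical, utilization is assumed to be a fraction between 0 and 1, savings are assumed to be proportional to the number of MUs removed, and the scale-up step of one MU is an assumption because the source does not say how large a scale-up should be.

```python
import math
from dataclasses import dataclass


@dataclass
class Recommendation:
    action: str         # "scale_up", "scale_down", or "no_change"
    target_units: int   # suggested number of Model Units
    savings_pct: float  # estimated savings, 0.0 when not scaling down


def rightsize(provisioned_units: int, utilization: float,
              throttled_requests: int, scale_up_step: int = 1) -> Recommendation:
    """Apply the four decision factors in order (hypothetical helper)."""
    # 1. Throttling detection: any throttled request triggers a scale-up.
    #    The step size is an assumption, not taken from the source.
    if throttled_requests > 0:
        return Recommendation("scale_up", provisioned_units + scale_up_step, 0.0)

    # 2. Utilization threshold: only usage below 50% is a scale-down candidate.
    if utilization < 0.50:
        # 4. Capacity buffer: keep 10% spare capacity above observed usage.
        needed = provisioned_units * utilization * 1.10
        # round() guards against float noise before rounding up to whole MUs.
        target = max(1, math.ceil(round(needed, 9)))
        # Assumption: cost scales linearly with the number of MUs.
        savings = (provisioned_units - target) / provisioned_units * 100
        # 3. Savings threshold: include only recommendations saving >= 5%.
        if target < provisioned_units and savings >= 5.0:
            return Recommendation("scale_down", target, savings)

    return Recommendation("no_change", provisioned_units, 0.0)
```

For example, under these assumptions 10 provisioned MUs at 30% utilization with no throttling would yield a scale-down to 4 MUs (3 MUs of observed usage plus the 10% buffer, rounded up), while the same endpoint with any throttled requests would yield a scale-up instead.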

Bedrock Provisioned Throughput Commitment

This recommendation analyzes your usage patterns and provisioned throughput of Bedrock model endpoints to identify opportunities to commit to a lower, cost-effective level of provisioned throughput without impacting performance, helping reduce unnecessary spend.

Provisioned Throughput Overview:

Provisioned throughput pricing is designed for situations where you need rock-solid, consistent performance. With this model, you reserve capacity ahead of time, which guarantees throughput and smooth performance. You’re billed hourly per model unit. With provisioned throughput pricing, you can choose between 1-month and 6-month commitments. The longer the commitment, the better the rate.

One good use case for a provisioned throughput model is a machine translation service that constantly processes a high volume of data. Here’s a pricing example:

  • Workload: Predictable, at 1,000,000 input tokens per hour
  • Commitment: You make a 1-month commitment for 1 unit of a model, which costs $39.60 per hour.

Provisioned throughput pricing gives you consistent performance at a predictable cost, which suits large, steady workloads.
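To make the pricing example concrete, the arithmetic can be sketched as follows. This is an illustration only: the hourly rate and workload come from the example above, while the 30-day month, the function names, and the assumption that the model unit is billed for every hour of the term are introduced here for the calculation.

```python
# Figures from the pricing example above.
HOURLY_RATE_USD = 39.60            # 1-month commitment, per model unit
MODEL_UNITS = 1
INPUT_TOKENS_PER_HOUR = 1_000_000

# Assumption: a 30-day month, billed for every hour.
HOURS_PER_MONTH = 24 * 30


def monthly_cost(hourly_rate: float, units: int,
                 hours: int = HOURS_PER_MONTH) -> float:
    """Total charge for the term: rate x units x hours."""
    return hourly_rate * units * hours


def cost_per_million_input_tokens(hourly_rate: float, units: int,
                                  tokens_per_hour: int) -> float:
    """Effective hourly spend divided by millions of input tokens processed."""
    return hourly_rate * units / (tokens_per_hour / 1_000_000)


total = monthly_cost(HOURLY_RATE_USD, MODEL_UNITS)
effective = cost_per_million_input_tokens(
    HOURLY_RATE_USD, MODEL_UNITS, INPUT_TOKENS_PER_HOUR
)
print(f"Monthly cost: ${total:,.2f}")
print(f"Effective cost per 1M input tokens: ${effective:.2f}")
```

Under these assumptions the commitment comes to about $28,512 for the month. The effective per-token figure holds only if the workload really does sustain 1,000,000 input tokens every hour; lower utilization raises the effective cost, which is what the rightsizing recommendation is meant to catch.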

When purchasing Provisioned Throughput, you can choose from the following commitment durations:

  • No Commitment: Pay hourly with the flexibility to delete the provisioned throughput at any time (AWS Documentation).
  • 1-Month Commitment: Lower hourly rates compared to no commitment. You cannot delete the provisioned throughput until the one-month term concludes (AWS Documentation).
  • 6-Month Commitment: Offers the most significant discount per hour. The provisioned throughput cannot be deleted until the six-month term ends.

These commitments are beneficial for workloads with consistent, high-volume inference needs, providing cost efficiency and guaranteed throughput.