AWS Weekly Roundup (Jan 26, 2026): EC2 G7e with NVIDIA Blackwell, Corretto Security Updates, and the Quiet Upgrades That Matter

AWS has a special talent for shipping headline-grabbing infrastructure one day and then—almost casually—dropping a handful of smaller features that end up saving teams real money and real sleep. The AWS Weekly Roundup: Amazon EC2 G7e instances with NVIDIA Blackwell GPUs (January 26, 2026) is a perfect example of that pattern.

The roundup (written by Micah Walter) spotlights a new GPU instance family designed for inference and graphics workloads, a fresh set of Amazon Corretto security updates for Java shops, and a few under-the-radar improvements across containers, observability, and contact center workflows. It’s the kind of week where cloud architects get excited about the big silicon—and platform engineers quietly whisper “finally” at an ECR improvement that trims minutes off CI/CD.

Below is a deeper, journalist-style unpacking of what shipped, why it matters, and how it fits into the broader 2026 cloud and AI landscape—without copy-pasting the roundup (because you already have the link and because plagiarism is not a feature).

1) The big launch: EC2 G7e instances with NVIDIA Blackwell (GA)

Let’s start with the obvious main character: Amazon EC2 G7e instances. As of January 20, 2026, these instances are generally available, accelerated by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, and initially offered in US East (N. Virginia) and US East (Ohio). AWS’s “What’s New” post frames them as the new sweet spot for generative AI inference plus graphics-heavy workloads such as spatial computing.

AWS also published a full announcement blog with detailed performance claims and a spec table. The headline claim is up to 2.3× better inference performance versus G6e, along with more GPU memory, faster networking, and improved GPU-to-GPU communication support.

What “G7e” is aiming to be (and what it is not)

The “G” family in EC2 traditionally maps to graphics and mixed workloads (graphics + compute), as opposed to AWS’s more training-centric, ultra-high-end GPU offerings. G7e continues that positioning, but it’s very clearly shaped by the 2026 reality: even teams that used to buy GPUs for visualization now want to run LLMs, multimodal models, and agentic systems—often in production inference mode, sometimes in light fine-tuning mode, and occasionally as part of simulation pipelines.

AWS explicitly calls out these target workloads: deploying LLMs, agentic AI, multimodal models, and even physical AI (robotics-adjacent workloads).

Specs that actually matter: memory, bandwidth, and a “single node” story

AWS’s G7e product page reads like it was written by someone who has seen too many teams try to cram modern models into yesterday’s GPU memory budgets. The big numbers:

  • Up to 8 GPUs per instance
  • 96 GB GPU memory per GPU, up to 768 GB total GPU memory
  • Up to 192 vCPUs and up to 2 TiB (2048 GiB) system memory
  • Up to 1600 Gbps networking bandwidth with Elastic Fabric Adapter (EFA) on the largest size
  • Up to 15.2 TB of local NVMe SSD storage

Those numbers are not marketing fluff—they are the difference between running a model comfortably in one place versus spreading it across a more complex multi-node architecture (and then spending your weekend debugging collective communications).

AWS also highlights a “single GPU can handle medium models” angle: with the higher memory, you can run a model up to ~70B parameters with FP8 precision on a single GPU (as described in the roundup and the G7e announcement content).
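Back-of-envelope math makes that claim plausible: at FP8, weights cost roughly one byte per parameter, so a ~70B-parameter model needs on the order of 70 GB for weights alone, which fits inside 96 GB with headroom for KV cache and activations (how much headroom depends on context length, batch size, and runtime overhead). A quick sketch of the arithmetic:

```python
# Back-of-envelope memory check (illustrative, not an AWS sizing tool).
# Assumes FP8 weights at roughly 1 byte per parameter; KV cache and
# activation needs vary with context length and batch size.
params_billion = 70
bytes_per_param_fp8 = 1
weights_gb = params_billion * 1e9 * bytes_per_param_fp8 / 1e9  # ~70 GB
gpu_memory_gb = 96

print(f"Estimated weight footprint: {weights_gb:.0f} GB")
print(f"Left for KV cache/activations: {gpu_memory_gb - weights_gb:.0f} GB")
```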

Instance sizes (from small to “please finance approve this”)

AWS provides a clean table of sizes, from g7e.2xlarge (1 GPU) up to g7e.48xlarge (8 GPUs). The biggest size stacks the full set of headline specs: 8 GPUs, 768 GB GPU memory, 192 vCPUs, 2048 GiB RAM, and 1600 Gbps networking.
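If you want the numbers for a given size straight from the API rather than a blog table, a minimal boto3 sketch like the one below works. It assumes the g7e instance types are visible in your chosen Region and that your installed botocore already knows about them.

```python
import boto3

# Pull the published specs for a G7e size from the EC2 API.
ec2 = boto3.client("ec2", region_name="us-east-1")

resp = ec2.describe_instance_types(InstanceTypes=["g7e.2xlarge"])
for it in resp["InstanceTypes"]:
    print(it["InstanceType"])
    print("  vCPUs:", it["VCpuInfo"]["DefaultVCpus"])
    print("  RAM (MiB):", it["MemoryInfo"]["SizeInMiB"])
    for gpu in it.get("GpuInfo", {}).get("Gpus", []):
        print("  GPU:", gpu["Manufacturer"], gpu["Name"],
              "x", gpu["Count"], "-",
              gpu["MemoryInfo"]["SizeInMiB"], "MiB each")
    print("  Network:", it["NetworkInfo"]["NetworkPerformance"])
```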

In practical terms, that means the smallest sizes are plausible for teams doing small-scale inference or development, while the largest sizes look designed for:

  • High-throughput inference of medium-to-large models
  • Multi-GPU inference with low-latency GPU-to-GPU communication
  • Graphics + AI workloads (simulation + neural rendering)
  • Cost-optimized, single-node fine-tuning or training for smaller GenAI models

GPUDirect, EFA, and why AWS keeps talking about bandwidth

Two of the most interesting parts of AWS’s G7e narrative are the emphasis on NVIDIA GPUDirect features and on networking uplift.

AWS says G7e supports NVIDIA GPUDirect Peer-to-Peer (P2P) for direct GPU communication over PCIe—useful when you split a model across multiple GPUs because one GPU can’t fit the entire working set. AWS also says the multi-GPU G7e sizes support GPUDirect RDMA with EFAv4 in EC2 UltraClusters, which matters when you do scale-out work across nodes and want to minimize the “network tax” on GPU workloads.

There’s also mention of GPUDirect Storage with Amazon FSx for Lustre and a throughput claim (up to 1.2 Tbps in the announcement blog). That combination of a fast parallel file system and GPU-direct data movement is a known pattern in HPC and increasingly in AI, where loading checkpoints and massive datasets can be as painful as the compute itself.

Why Blackwell shows up in “graphics” instances

One subtle thing here: these are NVIDIA Blackwell-based GPUs, and AWS leans into both AI inference and spatial computing. That’s a signal about where GPU demand is heading. Many organizations are not choosing between “graphics GPU” and “AI GPU” anymore—they are building products that combine both:

  • Digital twins that need rendering plus AI-driven behavior
  • Robotics simulation with physics + perception models
  • Avatar interfaces and 3D environments powered by LLMs
  • Industrial visualization pipelines with AI-assisted anomaly detection

AWS calls out ray tracing improvements and “workloads that combine graphics and AI” on the G7e instance type page.

Who should care: the “inference majority”

Not every team needs the biggest training clusters, but almost every team building production GenAI needs inference capacity that is:

  • Predictable under load
  • Memory-rich enough for modern models
  • Networked well enough for multi-GPU and occasional multi-node
  • Cost-manageable (or at least cost-explainable)

G7e looks like AWS doubling down on a category we can call the “inference majority”: workloads where latency, throughput, and stability matter more than setting training benchmarks. The 2.3× inference claim versus G6e is the kind of number procurement teams like, even if engineering teams will want to validate it against their own models and batch sizes.

2) Amazon Corretto January 2026 quarterly updates: the security patch train keeps moving

At the other end of the glamour spectrum sits a set of updates that most enterprises need more than they want: Amazon Corretto quarterly updates.

On January 20, 2026, AWS announced quarterly security and critical updates for Corretto Long-Term Supported (LTS) OpenJDK versions. The updated versions include:

  • Corretto 25.0.2
  • Corretto 21.0.10
  • Corretto 17.0.18
  • Corretto 11.0.30
  • Corretto 8u482

That list is worth paying attention to if your organization spans multiple Java generations (and many do). AWS reiterates that Corretto is a no-cost, multi-platform, production-ready distribution of OpenJDK, with updates available via direct download and via standard Linux repos.

Why this matters in 2026: “boring” runtime updates are now a supply-chain story

In a world where software supply chain security is no longer optional, keeping runtimes patched is part of your organization’s baseline risk management. Java remains deeply embedded in banking, retail, logistics, and government systems; even teams building shiny new AI services often run them behind Java-based APIs, integration layers, or streaming pipelines.

Corretto’s quarterly cadence makes it easier for organizations to align patching with change windows. And because Corretto tracks OpenJDK updates, these releases are essentially AWS shipping the latest upstream security and critical fixes in a form enterprises can operationalize.

If you’re running on AWS and want a “supported by AWS” story for your Java runtime, Corretto remains one of the most straightforward choices—especially for teams that don’t want to manage Oracle Java licensing complexity or build their own OpenJDK distribution pipeline.
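For teams that track patch drift across hosts, even a crude audit helps. Below is a minimal sketch (my own, not an AWS tool) that compares the locally installed JVM against the version lines from this quarterly update; it assumes `java` is on the PATH and that the patched version numbers listed above are the ones you are targeting.

```python
import re
import subprocess

# Compare the local JVM version against the January 2026 Corretto lines
# (version numbers taken from the announcement). `java -version` writes
# its banner to stderr by convention.
PATCHED = {"8": "8u482", "11": "11.0.30", "17": "17.0.18",
           "21": "21.0.10", "25": "25.0.2"}

out = subprocess.run(["java", "-version"], capture_output=True, text=True)
banner = out.stderr or out.stdout
match = re.search(r'version "([^"]+)"', banner)
if match:
    version = match.group(1)
    # Java 8 reports as 1.8.0_xxx; newer LTS lines report as 17.0.18 etc.
    major = "8" if version.startswith("1.8") else version.split(".")[0]
    print(f"Installed: {version}; latest Corretto {major}: "
          f"{PATCHED.get(major, 'unknown LTS line')}")
else:
    print("Could not parse `java -version` output:", banner.strip())
```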

3) Amazon ECR cross-repository layer sharing (blob mounting): small feature, big CI/CD impact

Now for the unsung hero of this week: Amazon ECR blob mounting, which enables cross-repository layer sharing within a registry.

As of January 20, 2026, AWS says ECR can share common image layers across repositories through “blob mounting.” The promise is simple: faster pushes (because you reuse layers instead of uploading duplicates) and lower storage costs (because you store common layers once). This is particularly relevant for microservices fleets built on common base images.

What blob mounting actually does (and what it doesn’t)

AWS documentation explains that when registry blob mounting is enabled, ECR checks for existing layers in your registry during push operations (when the client includes mounting parameters). If a layer exists in another repository, ECR mounts the existing layer rather than storing a second copy.
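You normally never issue that mount request yourself; docker, BuildKit, and similar clients send it during a push. Purely to illustrate the mechanism, here is a sketch of the cross-repository mount call defined by the OCI distribution spec, aimed at an ECR registry endpoint. The repository names and layer digest are placeholders, and production pushes should stay with your regular container tooling.

```python
import boto3
import requests

# Illustrative only: shows the OCI distribution-spec cross-repository
# blob mount that container clients send on your behalf:
#   POST /v2/<target>/blobs/uploads/?mount=<digest>&from=<source>
# A 201 response means the layer was mounted; a 202 means the registry
# fell back to starting a normal upload session.
ecr = boto3.client("ecr", region_name="us-east-1")
auth = ecr.get_authorization_token()["authorizationData"][0]
registry = auth["proxyEndpoint"]        # e.g. https://<acct>.dkr.ecr.<region>.amazonaws.com
token = auth["authorizationToken"]      # base64("AWS:<password>"), usable as Basic auth

target_repo = "service-a"               # placeholder repository names
source_repo = "base-images"
layer_digest = "sha256:<layer-digest>"  # placeholder digest

resp = requests.post(
    f"{registry}/v2/{target_repo}/blobs/uploads/",
    params={"mount": layer_digest, "from": source_repo},
    headers={"Authorization": f"Basic {token}"},
)
print(resp.status_code)  # 201 = mounted, 202 = regular upload started
```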

There are constraints that matter operationally:

  • Blob mounting works only within the same registry (same AWS account and Region).
  • Repositories must use identical encryption type/keys.
  • It’s not supported for images created via pull through cache.
  • If you disable it later, images pushed with blob mounting will still work; layers remain mounted.

In other words: this is not cross-account dedupe magic and not a global cache—but within a typical “single account per environment” layout, it can meaningfully reduce repeated work.
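Because mounting only happens between repositories with matching encryption settings, a quick inventory tells you where layer sharing can actually kick in. Here is a minimal boto3 sketch; the grouping logic is my own convenience, not an AWS-provided check.

```python
from collections import defaultdict

import boto3

# Group repositories by encryption configuration: blob mounting only
# dedupes layers between repositories that share the same encryption
# type (and KMS key, where applicable).
ecr = boto3.client("ecr", region_name="us-east-1")
groups = defaultdict(list)

paginator = ecr.get_paginator("describe_repositories")
for page in paginator.paginate():
    for repo in page["repositories"]:
        enc = repo.get("encryptionConfiguration", {})
        key = (enc.get("encryptionType", "AES256"), enc.get("kmsKey"))
        groups[key].append(repo["repositoryName"])

for (enc_type, kms_key), names in groups.items():
    print(f"{enc_type} (key={kms_key}): {len(names)} repositories")
```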

Why platform engineers will feel this immediately

Teams that operate container-heavy platforms usually have at least one of these patterns:

  • Dozens (or hundreds) of repos with the same base layers
  • Frequent rebuilds due to security patching (glibc, OpenSSL, base OS updates)
  • CI pipelines pushing images repeatedly across services

Blob mounting directly targets the “why are we uploading the same base image layers again?” problem. Faster pushes aren’t just about developer impatience; they reduce pipeline duration, shrink the time to roll out urgent fixes, and cut down on the operational noise of build systems.

AWS says blob mounting is available in all AWS commercial and AWS GovCloud (US) Regions, which is notable because regulated workloads often have even more rigid patching expectations and slower pipelines.

4) CloudWatch Database Insights expands to four more regions: ML-assisted diagnosis moves closer to “default”

Database observability is one of those topics that sounds niche until your primary database starts behaving like it’s trying to communicate exclusively through sighs.

AWS announced that CloudWatch Database Insights on-demand analysis is now available in four additional regions: Asia Pacific (New Zealand), Asia Pacific (Taipei), Asia Pacific (Thailand), and Mexico (Central). This expansion was posted on January 20, 2026.

What Database Insights on-demand analysis does

AWS describes Database Insights as a monitoring and diagnostics solution that provides visibility into metrics, query analysis, and resource utilization patterns. The on-demand analysis experience uses machine learning to identify performance bottlenecks in a selected time period, compare that period against baseline behavior, flag anomalies, and suggest remediation steps, reducing mean time to diagnosis from hours to minutes (as claimed in the AWS post).

A key practical detail: AWS says you enable this by turning on the Advanced mode of CloudWatch Database Insights for Amazon Aurora and Amazon RDS databases (via consoles, APIs, SDK, or CloudFormation).
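For the SDK route, the sketch below switches an Aurora cluster to Advanced mode. The DatabaseInsightsMode parameter and the 465-day Performance Insights retention requirement reflect my reading of the RDS API at the time of the Database Insights launch; treat the exact parameter names and values as assumptions to verify against the current API reference.

```python
import boto3

# Sketch: move an Aurora cluster to Database Insights Advanced mode via
# the RDS API. DatabaseInsightsMode and the 465-day Performance Insights
# retention requirement are my assumptions about the current API; verify
# against the RDS documentation before running this.
rds = boto3.client("rds", region_name="us-east-1")

rds.modify_db_cluster(
    DBClusterIdentifier="my-aurora-cluster",   # placeholder identifier
    DatabaseInsightsMode="advanced",
    EnablePerformanceInsights=True,
    PerformanceInsightsRetentionPeriod=465,    # extended retention for Advanced mode
    ApplyImmediately=True,
)
```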

Industry context: “ML inside ops tools” is now expected, not experimental

Observability vendors have spent years layering ML onto dashboards—sometimes with mixed results. AWS’s approach here is pragmatic: if you already have a pile of metrics and query performance data, the pain is in correlation and root-cause analysis. Automating the comparison to baseline and highlighting anomalies is the kind of task ML can be reasonable at, especially when it’s guiding a human DBA rather than making autonomous changes.

The bigger picture is that AWS continues to fold higher-level operational intelligence into CloudWatch, not just raw telemetry. That’s part of a broader trend toward “ops copilots”—tools that reduce cognitive load during incidents.

5) Amazon Connect Step-by-Step Guides: conditional logic and real-time updates

AWS also shipped improvements to Amazon Connect Step-by-Step Guides on January 23, 2026. The update adds:

  • Conditional UI logic (show/hide fields, change defaults, adjust required fields based on prior inputs)
  • Real-time data refresh from Connect resources at specified intervals

This is positioned as a way for managers to build more dynamic guided experiences for agents, reducing workflow friction and keeping information current.

Why a “UI logic” feature is more strategic than it sounds

Contact centers are increasingly software-defined workflows. When you can implement conditional logic and real-time refresh in the guide layer, you reduce the need for custom agent tooling and minimize the “tribal knowledge” problem where only your best agents know which branch of a process to follow.

The region list is also broad (including GovCloud US-West), which hints that AWS sees these workflow tools as relevant not just for commercial call centers, but also for regulated customer service operations.

6) Upcoming event: Best of AWS re:Invent (Jan 28–29, 2026)

The roundup also points readers to a near-term virtual event: Best of AWS re:Invent, running January 28–29, 2026 depending on time zone. The event is free and features an opening session by Jeff Barr plus curated sessions and live Q&A.

If you’re reading this from the United States, the AMER broadcast is January 28, 2026 at 9:00 AM Pacific Time (per the event page).

Why this event matters: re:Invent announcements have a long tail

Even if you followed re:Invent news in real time, “Best of re:Invent” is often where technical teams actually absorb the material and map it onto roadmap decisions. AWS’s own description focuses on translating announcements into “competitive advantage,” but the operational value is simpler: it’s a chance to understand what’s production-ready versus what’s early-access-ish, and what will realistically fit your organization’s skills and governance.

7) Putting it together: what this roundup says about AWS’s 2026 priorities

One week’s announcements do not define a year, but the combination in this roundup is revealing. It suggests AWS is continuing to push on three fronts:

7.1 AI infrastructure: inference-first, production-first

G7e is designed around deployment realities: memory capacity, bandwidth, and multi-GPU performance features. The messaging leans heavily into inference, while still leaving room for cost-optimized fine-tuning/training.

7.2 Platform engineering: make the “container factory” less wasteful

ECR blob mounting is a platform-engineering play. It doesn’t change the world, but it cuts down duplicated work across repositories, which scales directly with the number of services you run.

7.3 Operations and business apps: guided intelligence over raw tooling

Database Insights on-demand analysis and Connect Step-by-Step Guides improvements both aim to reduce human toil. One helps you diagnose database behavior with ML assistance; the other helps you standardize and speed up customer service workflows through dynamic UIs and current data.

8) Practical takeaways and next steps (without pretending every reader is an AWS Solutions Architect)

If you run GenAI inference workloads

  • Evaluate G7e if you are currently constrained by GPU memory or inter-GPU bandwidth on older instance families. The spec improvements (96 GB per GPU; up to 8 GPUs; faster networking) are aimed directly at those bottlenecks.
  • Validate the “2.3× inference performance” claim with your own model, precision mode, batching strategy, and latency SLOs (a minimal harness sketch follows this list). AWS’s number is a strong directional signal, but benchmarking is still a sport you must play yourself.
  • Consider architecture simplification: if a model can fit on a single GPU (AWS references ~70B parameters with FP8), you may reduce complexity versus multi-node designs.
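A minimal harness is often enough to sanity-check vendor multipliers before committing to an instance family. The sketch below times sequential requests against a placeholder HTTP inference endpoint; the URL, payload shape, and request count are all assumptions you would swap for your own serving stack and traffic pattern.

```python
import statistics
import time

import requests

# Minimal latency harness for checking performance claims against your
# own model and traffic shape. Endpoint and payload are placeholders for
# whatever serving stack you run (vLLM, TGI, Triton, a SageMaker
# endpoint, ...); requests are issued sequentially.
ENDPOINT = "http://localhost:8000/v1/completions"   # placeholder URL
PAYLOAD = {"model": "my-model", "prompt": "hello", "max_tokens": 128}

latencies = []
for _ in range(50):
    start = time.perf_counter()
    requests.post(ENDPOINT, json=PAYLOAD, timeout=120)
    latencies.append(time.perf_counter() - start)

latencies.sort()
p50 = statistics.median(latencies)
p95 = latencies[int(len(latencies) * 0.95)]
print(f"p50={p50*1000:.0f} ms  p95={p95*1000:.0f} ms  "
      f"throughput≈{len(latencies)/sum(latencies):.2f} req/s (sequential)")
```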

If you maintain Java services

  • Plan Corretto updates into your quarterly patch cycle, especially across multiple LTS lines (8, 11, 17, 21, 25).
  • Use repository-based updates (Apt/Yum/Apk) if you want more consistent fleet management and fewer “one host is special” incidents.

If you ship containers at scale

  • Enable ECR blob mounting if you have many repos sharing base layers and you want faster pushes plus less duplicated storage.
  • Check constraints: same account and region; consistent encryption; no pull-through cache support for blob mounting.

If your database performance incidents are too “artisanal”

  • Consider turning on Advanced mode of CloudWatch Database Insights for Aurora/RDS fleets to leverage the on-demand analysis workflow, especially if you’re in newly supported regions like Mexico (Central) or APAC (Taipei/Thailand/New Zealand).

Bas Dorland, Technology Journalist & Founder of dorland.org