What is hermes-agent-aws?

hermes-agent-aws is an AI agent skill by Shahriar Labs that deploys a private, always-on AI agent on AWS (ECS Fargate + DynamoDB + API Gateway) in a single command. It provides persistent memory, free-LLM failover via OpenRouter, and a REST API for integration.

Why run a private AI agent on AWS instead of using a hosted service?

A private agent gives you full control over your data (no third-party retention), custom tool integrations with your internal APIs and databases, persistent memory tailored to your use case, and consistent availability without usage limits. Hosted agent services typically restrict integrations and retain conversation data.

What does hermes-agent-aws cost to run?

The baseline AWS infrastructure (ECS Fargate at 0.25 vCPU / 0.5 GB RAM, DynamoDB on-demand, API Gateway) costs approximately $17/month with moderate usage. LLM API costs are additional; routing non-critical paths to free models through openrouter-free reduces them significantly.
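As a rough check on the compute portion of that figure, the sketch below multiplies the task size by per-hour Fargate rates. The two rates are assumptions (commonly published on-demand Linux/x86 prices for us-east-1, not taken from this article; they vary by region and change over time), and DynamoDB, API Gateway, and the other services are billed separately.

```python
# Rough monthly cost of one always-on Fargate task.
# ASSUMPTION: the rates below are illustrative on-demand Linux/x86
# us-east-1 prices; check the current AWS pricing page for your region.
VCPU_HOUR_USD = 0.04048   # assumed price per vCPU-hour
GB_HOUR_USD = 0.004445    # assumed price per GB-hour of memory
HOURS_PER_MONTH = 730     # average number of hours in a month


def fargate_monthly_cost(vcpu: float, memory_gb: float,
                         hours: float = HOURS_PER_MONTH) -> float:
    """Return the estimated monthly compute cost in USD for one task."""
    hourly = vcpu * VCPU_HOUR_USD + memory_gb * GB_HOUR_USD
    return hourly * hours


if __name__ == "__main__":
    cost = fargate_monthly_cost(vcpu=0.25, memory_gb=0.5)
    print(f"Fargate compute: ${cost:.2f}/month")
```

Under these assumed rates, the 0.25 vCPU / 0.5 GB task comes to roughly $9/month of compute if it runs around the clock, which leaves the rest of the ~$17 estimate for the other services.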

What AWS services does hermes-agent-aws use?

hermes-agent-aws uses ECS Fargate for the containerized agent runtime, DynamoDB for persistent memory and session storage, API Gateway for the REST endpoint, SQS for async task queuing, Secrets Manager for API keys, and CloudWatch for logging and alerts.

hermes-agent-aws: Private AI Agents on AWS

hermes-agent-aws deploys a private, always-on AI agent on AWS in one command, with persistent memory, free-LLM failover, and a REST API, for ~$17/month. Built by Shahriar Labs, it solves the "always-on agent" problem: you need an agent that is reachable, remembers context, and keeps its state between calls, without paying $100+/month for a managed agent platform.

Architecture

hermes-agent-aws provisions:

ECS Fargate: Containerized agent runtime. Scales to zero when idle, spins up in ~30s on cold start. Stateless compute — state lives in DynamoDB.
DynamoDB: Persistent memory store. Session memory (TTL-based), long-term memory (no TTL), and tool output cache (short TTL).
API Gateway: REST endpoint for agent invocation. Supports sync (wait for response) and async (queue task, poll for result) modes.
SQS: Task queue for async invocations and long-running agent jobs.
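The three memory tiers listed above differ mainly in whether, and how soon, an item expires. A minimal sketch of how such items might be shaped follows; the attribute names (pk, sk, expires_at), the tier names, and the TTL durations are hypothetical and not taken from hermes-agent-aws. DynamoDB's TTL feature expects a Number attribute holding a Unix epoch timestamp in seconds, which is what expires_at carries here.

```python
import time
from typing import Any, Optional

# ASSUMPTION: tier names and durations are illustrative only.
TTL_SECONDS: dict[str, Optional[int]] = {
    "session": 24 * 3600,   # session memory: expires after a day
    "long_term": None,      # long-term memory: never expires
    "tool_cache": 300,      # tool output cache: short TTL
}


def build_memory_item(tier: str, owner_id: str, key: str, value: Any,
                      now: Optional[float] = None) -> dict[str, Any]:
    """Build a plain dict describing one memory item for the given tier."""
    if tier not in TTL_SECONDS:
        raise ValueError(f"unknown memory tier: {tier}")
    now = time.time() if now is None else now
    item: dict[str, Any] = {
        "pk": f"{tier}#{owner_id}",  # hypothetical partition key
        "sk": key,                   # hypothetical sort key
        "value": value,
        "created_at": int(now),
    }
    ttl = TTL_SECONDS[tier]
    if ttl is not None:
        # Epoch seconds; DynamoDB removes the item some time after this.
        item["expires_at"] = int(now) + ttl
    return item
```

A dict like this could be passed to a DynamoDB client, for example boto3's Table.put_item(Item=...), provided the table's TTL attribute is configured as expires_at. Omitting the attribute entirely is what keeps long-term items from expiring.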

The infrastructure is provisioned with Terraform. A single terraform apply deploys the full stack, and terraform destroy tears it down.
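Once the stack is deployed, the async mode described above (queue a task, then poll for its result) could be driven from a client roughly as follows. This is a sketch under assumptions: the status payload shape (a "state" field of "pending", "succeeded", or "failed") is hypothetical, and the HTTP call is injected as a function so the polling logic does not depend on any particular endpoint path.

```python
import time
from typing import Callable, Dict

TERMINAL_STATES = {"succeeded", "failed"}  # hypothetical state names


def wait_for_result(get_status: Callable[[str], Dict],
                    task_id: str,
                    interval_s: float = 2.0,
                    timeout_s: float = 120.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict:
    """Poll get_status(task_id) until the task reaches a terminal state."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(task_id)
        if status.get("state") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"task {task_id} still pending after {timeout_s}s")
        sleep(interval_s)
```

In practice get_status would wrap an authenticated GET against the deployed API Gateway URL. Passing it in as a callable keeps the timeout and retry behaviour easy to test in isolation.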

Free-LLM Failover

hermes-agent-aws integrates openrouter-free routing: primary calls go to your paid model (Claude, GPT-4), while non-critical tasks, and any calls made while the primary is rate-limited, are routed automatically to free models (DeepSeek R1, Qwen 3). This keeps the agent responsive when the primary model is unavailable and can reduce API costs by 40–60% for high-volume use cases.
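The routing rule described above can be sketched as a small function. This is not the openrouter-free implementation: the ModelUnavailable exception, the (name, callable) model pairs, and the choice to let non-critical tasks fall back to the paid model as a last resort are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple


class ModelUnavailable(Exception):
    """Raised by a model callable when it is rate-limited or down."""


# A model is a (name, function) pair; the function maps a prompt to text.
Model = Tuple[str, Callable[[str], str]]


def complete_with_failover(prompt: str, paid: Model, free: Sequence[Model],
                           critical: bool = True) -> Tuple[str, str]:
    """Return (model_name, completion) from the first model that answers."""
    # Critical tasks try the paid model first; non-critical tasks try the
    # free models first and only then the paid one (an assumed policy).
    order = [paid, *free] if critical else [*free, paid]
    last_error: Optional[Exception] = None
    for name, call in order:
        try:
            return name, call(prompt)
        except ModelUnavailable as exc:
            last_error = exc  # remember the failure and try the next model
    raise RuntimeError("all models unavailable") from last_error
```

Each callable would wrap a real API request and translate rate-limit responses into ModelUnavailable. Returning the model name alongside the text makes it possible to log how often the fallback path is taken.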

When to Use hermes-agent-aws

Use it when you need an agent that integrates deeply with your internal AWS infrastructure (RDS, S3, Lambda), persistent memory across sessions for a specific user or workflow, a private endpoint that does not share data with hosted agent providers, or always-on availability for business-critical agent tasks. For multi-tenant SaaS architectures, see our post on multi-tenant SaaS on AWS ECS.

Frequently Asked Questions

What is hermes-agent-aws? An AI agent skill that deploys a private, always-on agent on AWS (ECS + DynamoDB + API Gateway) with persistent memory for ~$17/month.
Why run private instead of hosted? Data control, custom tool integrations, no usage limits, consistent availability.
What does it cost? ~$17/month for baseline AWS infrastructure. LLM API costs are extra; free-model fallback reduces them significantly.
What AWS services are used? ECS Fargate, DynamoDB, API Gateway, SQS, Secrets Manager, CloudWatch.

Written by Shihab Shahriar Antor — AI Engineer & Founder of Shahriar Labs. Building LetX, QuantumSketch, and open-source AI agent skills.

