CloudTrail records everything but alerts on nothing. The audit trail is valuable for investigation after an incident, but without an alerting layer on top of it, the log that shows someone deleted your production database or created a backdoor IAM user sits in S3 unread while damage accumulates. Building alerting on CloudTrail means deciding which API calls are security-relevant enough to warrant real-time notification and choosing the right detection mechanism for each.
Two mechanisms dominate CloudTrail-based alerting: Amazon EventBridge and CloudWatch Logs metric filters. Understanding their different characteristics — latency, flexibility, cost, complexity — determines which to use for each alert type.
EventBridge vs. CloudWatch Metric Filters
EventBridge receives CloudTrail management events with roughly 1–2 second latency. You write rules that match specific event patterns — event source, event name, or any field in the CloudTrail event JSON — and EventBridge triggers targets (SNS, Lambda, SQS) when a rule matches. EventBridge rules support rich pattern matching including field-level comparisons, array containment checks, and prefix matching. For most real-time alerting use cases, EventBridge is the right choice.
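As a rough illustration of the rule-plus-target setup described above, the following Python sketch builds an event pattern and registers it with an SNS target. The helper names, the rule name, and the topic ARN are hypothetical; `put_rule` and `put_targets` are the boto3 EventBridge client calls, and the client is passed in so the pattern logic can be exercised without AWS credentials.

```python
import json


def build_event_name_pattern(event_source, event_names):
    """Match CloudTrail management events from one service by event name."""
    return {
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": [event_source],
            "eventName": list(event_names),
        },
    }


def create_alert_rule(events_client, rule_name, pattern, sns_topic_arn):
    """Create (or update) a rule and attach an SNS topic as its target.

    events_client is expected to be boto3.client("events").
    """
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        # "sns-alert" is an arbitrary target ID chosen for this sketch.
        Targets=[{"Id": "sns-alert", "Arn": sns_topic_arn}],
    )


if __name__ == "__main__":
    pattern = build_event_name_pattern(
        "cloudtrail.amazonaws.com", ["StopLogging", "DeleteTrail"]
    )
    print(json.dumps(pattern, indent=2))
```

The SNS topic also needs a resource policy that allows EventBridge to publish to it; that step is omitted here.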
CloudWatch Logs metric filters work differently: CloudTrail delivers logs to CloudWatch Logs (with 5–15 minute latency), metric filters scan incoming log lines for patterns and increment custom metrics when matches occur, and CloudWatch alarms trigger when metric values breach thresholds. This pipeline has higher latency and more moving parts than EventBridge, but it provides the ability to alert on the absence of an event (using CloudWatch alarm "missing data" functionality), which EventBridge cannot do. CloudWatch Logs metric filters are valuable for the subset of security alerts where you need to detect that something didn't happen.
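The absence-of-event case can be sketched as a metric filter plus an alarm that treats missing data as breaching. The example below only builds the keyword arguments for boto3's `put_metric_filter` (CloudWatch Logs) and `put_metric_alarm` (CloudWatch); the log group, namespace, metric name, and the choice of a nightly `CreateSnapshot` call as the expected event are all assumptions made for illustration.

```python
def metric_filter_kwargs(log_group, metric_name, event_name,
                         namespace="Security/CloudTrail"):
    """Arguments for logs.put_metric_filter: count one CloudTrail event name."""
    return {
        "logGroupName": log_group,
        "filterName": metric_name,
        # JSON filter syntax: match log events whose eventName equals event_name.
        "filterPattern": '{ $.eventName = "%s" }' % event_name,
        "metricTransformations": [{
            "metricName": metric_name,
            "metricNamespace": namespace,
            "metricValue": "1",
        }],
    }


def absence_alarm_kwargs(metric_name, sns_topic_arn,
                         namespace="Security/CloudTrail",
                         period_seconds=86400):
    """Arguments for cloudwatch.put_metric_alarm: fire when the event is absent."""
    return {
        "AlarmName": f"{metric_name}-missing",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        # No data points in the period counts as a breach, which is
        # what turns this into an "it did not happen" alert.
        "TreatMissingData": "breaching",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    print(metric_filter_kwargs("cloudtrail-logs", "NightlyBackupSnapshot",
                               "CreateSnapshot"))
```

With real clients, these would be applied as `logs.put_metric_filter(**kwargs)` and `cloudwatch.put_metric_alarm(**kwargs)`.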
The 10 API Events That Warrant Immediate Alerting
Root account console login (ConsoleLogin from root identity): Any root account login should trigger an immediate alert. The EventBridge pattern matches userIdentity.type == "Root" and eventName == "ConsoleLogin". Route to PagerDuty or equivalent for immediate response.
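A minimal sketch of the root login pattern follows, assuming console sign-ins arrive with the detail-type "AWS Console Sign In via CloudTrail". The local `is_root_console_login` helper is hypothetical and simply mirrors the same two conditions so a Lambda target can double-check the event it receives.

```python
import json

# EventBridge pattern: ConsoleLogin events whose identity type is Root.
ROOT_CONSOLE_LOGIN_PATTERN = {
    "detail-type": ["AWS Console Sign In via CloudTrail"],
    "detail": {
        "eventName": ["ConsoleLogin"],
        "userIdentity": {"type": ["Root"]},
    },
}


def is_root_console_login(event):
    """Re-check, in plain Python, the conditions the pattern expresses."""
    detail = event.get("detail", {})
    return (
        event.get("detail-type") == "AWS Console Sign In via CloudTrail"
        and detail.get("eventName") == "ConsoleLogin"
        and detail.get("userIdentity", {}).get("type") == "Root"
    )


if __name__ == "__main__":
    print(json.dumps(ROOT_CONSOLE_LOGIN_PATTERN, indent=2))
```

Sign-in events are regional, so it is worth confirming which region records root sign-ins in your account and creating the rule there.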
DeleteTrail or StopLogging: Disabling CloudTrail is a standard attacker cleanup step. Alert immediately. These events should trigger an automated response that re-enables logging in addition to notifying the security team.
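The automated response could look something like the Lambda sketch below. It assumes the trail name or ARN appears at `requestParameters.name` in the StopLogging event, and it uses the boto3 CloudTrail `start_logging` call. A deleted trail cannot be restarted this way, so DeleteTrail falls through to notification only in this sketch.

```python
def handle_trail_tampering(event, cloudtrail_client):
    """Restart logging after StopLogging; report everything else for humans."""
    detail = event.get("detail", {})
    event_name = detail.get("eventName")
    # Assumption: the affected trail is identified by requestParameters.name.
    trail = (detail.get("requestParameters") or {}).get("name")

    if event_name == "StopLogging" and trail:
        cloudtrail_client.start_logging(Name=trail)
        return {"action": "restarted", "trail": trail}

    # DeleteTrail (or a malformed event): recreating a trail needs its full
    # configuration, so this sketch leaves that to the responder.
    return {"action": "notify_only", "trail": trail, "eventName": event_name}


def lambda_handler(event, context):
    # boto3 is imported here so the logic above can be tested without it.
    import boto3
    return handle_trail_tampering(event, boto3.client("cloudtrail"))
```

The function's execution role would need `cloudtrail:StartLogging`, and the result should still be forwarded to the security team rather than treated as a silent fix.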
CreateUser or AttachUserPolicy with broad permissions: The creation of new IAM users — especially combined with admin policy attachment — is a privilege escalation signal. Alert on CreateUser and correlate with any subsequent AttachUserPolicy events.
CreateAccessKey for root or existing users: Creating new access keys for existing users (especially those with elevated privileges) is a persistence mechanism. Alert on all CreateAccessKey events.
PutBucketPolicy or PutBucketAcl on sensitive buckets: Changes to S3 bucket policies and ACLs are a common data exfiltration setup. Alert on policy changes to buckets containing sensitive data.
AuthorizeSecurityGroupIngress with broad CIDR: Opening security group rules to 0.0.0.0/0 or large CIDR ranges warrants notification. Filter on the cidrIp values in each permission's ipRanges list, under requestParameters.ipPermissions in the event, to catch overbroad rules.
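One way to apply the CIDR check is in a small Lambda-side function, sketched below. It assumes the event nests ranges as `requestParameters.ipPermissions.items[].ipRanges.items[].cidrIp`, and the /16 cutoff for "large" is an arbitrary threshold picked for illustration.

```python
import ipaddress


def broad_ingress_cidrs(detail, min_prefix_len=16):
    """Return IPv4 CIDRs in the request that are wider than min_prefix_len."""
    found = []
    params = detail.get("requestParameters") or {}
    permissions = (params.get("ipPermissions") or {}).get("items", [])
    for perm in permissions:
        for rng in (perm.get("ipRanges") or {}).get("items", []):
            cidr = rng.get("cidrIp")
            if not cidr:
                continue
            # A shorter prefix means a larger address range.
            network = ipaddress.ip_network(cidr, strict=False)
            if network.prefixlen < min_prefix_len:
                found.append(cidr)
    return found


if __name__ == "__main__":
    sample = {"requestParameters": {"ipPermissions": {"items": [
        {"ipRanges": {"items": [{"cidrIp": "0.0.0.0/0"},
                                {"cidrIp": "10.1.2.0/24"}]}},
    ]}}}
    print(broad_ingress_cidrs(sample))
```

IPv6 ranges would need the same treatment with their own threshold; they are left out to keep the sketch short.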
DeactivateMFADevice or DeleteVirtualMFADevice: Removing MFA from any account is a security regression. Alert on these events for all principals, especially the root account.
GetSecretValue at unusual hours or from unexpected principals: Secret access by principals with no history of accessing that secret, particularly at unusual hours, warrants investigation. This requires correlation logic beyond a simple EventBridge rule — a Lambda processor can apply this context.
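The correlation logic for secret access might be sketched as follows. The `known_principals` mapping (secret ID to the set of principal ARNs expected to read it), the 08:00 to 18:00 UTC business-hours window, and the function name are all assumptions; in practice the mapping would come from your own access history.

```python
from datetime import datetime, timezone


def secret_access_findings(detail, known_principals, business_hours=(8, 18)):
    """Return the reasons a GetSecretValue call looks unusual (empty if none)."""
    reasons = []
    principal = detail.get("userIdentity", {}).get("arn", "unknown")
    secret_id = (detail.get("requestParameters") or {}).get("secretId", "unknown")

    if principal not in known_principals.get(secret_id, set()):
        reasons.append("unexpected_principal")

    # CloudTrail timestamps are UTC, e.g. "2024-05-01T03:12:45Z".
    event_time = datetime.strptime(
        detail["eventTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    start_hour, end_hour = business_hours
    if not (start_hour <= event_time.hour < end_hour):
        reasons.append("outside_business_hours")

    return reasons
```

A Lambda target would call this on `event["detail"]` and alert only when the returned list is non-empty.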
AssumeRoleWithWebIdentity from unexpected sources: Web identity federation assertions from unexpected identity providers can indicate OIDC token abuse. Alert on assertions from providers not in your expected list.
CloudTrail configuration changes: Any modification to CloudTrail trails — UpdateTrail, PutEventSelectors, RemoveTags — should be treated as a security-relevant event. Attackers who compromise an account often modify trail settings to reduce logging scope.
Alert Routing Architecture
Route alerts to channels appropriate to their severity. Root account logins, DeleteTrail, and CreateUser events warrant PagerDuty or similar on-call notification regardless of the hour. Security group changes, access key creation, and S3 policy changes can route to a Slack security channel for business-hours review.
Lambda functions provide richer routing logic than SNS alone. A Lambda processor can enrich alerts with context (cross-reference the API caller against a known automation role list, check if the action matches a scheduled change window, add the account name rather than just the account ID) before routing to the appropriate channel. The additional processing latency — typically under a second — is acceptable for security alerting.
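A sketch of such a processor is shown below. The channel names, the `PAGE_EVENTS` set, and the lookup tables are hypothetical stand-ins for your own configuration; the severity split follows the routing guidance above.

```python
# Events that page on-call regardless of the hour (assumed configuration).
PAGE_EVENTS = {"DeleteTrail", "StopLogging", "CreateUser"}


def route_alert(detail, account_names, automation_roles):
    """Enrich a CloudTrail event and choose a notification channel."""
    event_name = detail.get("eventName", "")
    identity = detail.get("userIdentity", {})
    issuer = (
        identity.get("sessionContext", {})
        .get("sessionIssuer", {})
        .get("userName")
    )
    account_id = detail.get("recipientAccountId", "")

    is_root_login = event_name == "ConsoleLogin" and identity.get("type") == "Root"
    channel = "pagerduty" if is_root_login or event_name in PAGE_EVENTS else "slack"

    return {
        "channel": channel,
        "eventName": event_name,
        "accountId": account_id,
        # Fall back to the raw ID when the account is not in the lookup table.
        "accountName": account_names.get(account_id, account_id),
        # Annotate rather than drop, so responders can see why it is low risk.
        "knownAutomation": issuer in automation_roles,
    }
```

The returned dictionary can then be serialized and published to whichever SNS topic or webhook backs each channel.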
Suppressing Known-Good Patterns
Every alert type above generates false positives. Scheduled automation creates IAM users. Legitimate administrators modify security groups. Deployment pipelines update bucket policies. Build suppression into your EventBridge rules rather than accepting the noise.
EventBridge rule patterns support negative matching. An AttachUserPolicy alert rule can exclude events where the userIdentity.sessionContext.sessionIssuer.userName matches your known automation roles. An AuthorizeSecurityGroupIngress rule can exclude events where the requestParameters.groupId is a known development security group. Start with broad alerts and narrow them as you identify the false positive patterns specific to your environment.
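The AttachUserPolicy exclusion described above can be expressed with EventBridge's `anything-but` operator, as in this sketch. The role names are placeholders and the helper name is hypothetical.

```python
import json


def attach_user_policy_pattern(automation_roles):
    """Match AttachUserPolicy unless the caller's role is known automation."""
    return {
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["iam.amazonaws.com"],
            "eventName": ["AttachUserPolicy"],
            "userIdentity": {
                "sessionContext": {
                    "sessionIssuer": {
                        # Negative match: any issuer except the listed roles.
                        "userName": [{"anything-but": sorted(automation_roles)}]
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    # Placeholder role names for illustration only.
    print(json.dumps(
        attach_user_policy_pattern({"terraform-ci", "deploy-pipeline"}),
        indent=2,
    ))
```

Calls made directly by an IAM user carry no `sessionIssuer` field and may not match this clause at all, so it is worth verifying the pattern against both kinds of event before relying on it.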
Related Reading
- CloudTrail best practices — complete audit logging configuration
- IAM security monitoring — root account protection and policy drift detection
- GuardDuty setup — complementary threat detection alongside CloudTrail alerting
FAQ
How do I test my CloudTrail alerting rules without generating real security events?
Use EventBridge's TestEventPattern API, or the test-event feature in the EventBridge console, to check synthetic CloudTrail-shaped events against your rule patterns. CloudTrail's PutEventSelectors API is not a testing tool: it configures which events a trail records and cannot send test events. For rules that should fire on root account activity, consider creating a test procedure in your runbooks that uses the root account for a legitimate low-risk action (reviewing billing) and confirms the alert fires as expected.
What is the latency from API call to EventBridge-triggered alert?
CloudTrail delivers management events to EventBridge with approximately 1–2 second latency from when the API call occurs. EventBridge rule matching and target invocation add another 1–5 seconds. Once notification delivery is included, total end-to-end latency from API call to SNS notification is typically 5–15 seconds, fast enough for meaningful real-time response.
Should I build all alerting in EventBridge or use a SIEM instead?
For AWS-specific alerting, EventBridge rules are lower-latency, lower-cost, and require less operational overhead than a SIEM. Use EventBridge for AWS security event alerting. A SIEM adds value when you need to correlate AWS events with non-AWS data sources (on-premises logs, SaaS application events, endpoint telemetry). If your environment is AWS-only, a SIEM is unlikely to justify its cost for CloudTrail alerting alone.
Written by Vigilare Engineering
Platform Team