When you want to better visualize AWS logs, it is common to use Athena and QuickSight.
This time, I will write about the standard approach using Athena and QuickSight.
1. Storing WAF Logs in S3
To query AWS WAF logs using Athena, the logs must first be stored in an Amazon S3 bucket.
Enable WAF Logging
- Go to the AWS WAF console and navigate to your WebACL.
- Select Logging and metrics and enable logging.
- Configure Amazon Kinesis Data Firehose as the delivery stream to store logs in S3.
- Choose an S3 bucket as the final destination for logs.
- Set up compression (Gzip recommended) to optimize storage.
2. Querying WAF Logs with Athena
Athena can be used to analyze JSON-formatted WAF logs stored in S3.
Create a Glue Crawler (Recommended)
- Open the AWS Glue console and create a new crawler.
- Set the data source as the S3 path containing WAF logs.
- Define a new database for WAF logs.
- Run the crawler to create a structured Athena table.
Manually Create an Athena Table (Alternative)
Run the following SQL command in Athena to define a table for WAF logs:
CREATE EXTERNAL TABLE waf_logs (
timestamp string,
formatVersion int,
webaclId string,
action string,
ruleGroupList array<struct<ruleId:string, action:string>>,
terminatingRuleId string,
terminatingRuleType string,
rateBasedRuleList array<struct<rateBasedRuleId:string, count:int>>,
httpSourceName string,
httpSourceId string,
ruleAction string,
httpRequest struct<
clientIp:string,
country:string,
headers array<struct<name:string, value:string>>,
uri string,
args string,
httpVersion string,
httpMethod string,
requestId string
>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://your-waf-logs-bucket/'
TBLPROPERTIES ('has_encrypted_data'='false');
Run Queries in Athena
Example: Count the number of requests per client IP.
SELECT httpRequest.clientIp, COUNT(*) as request_count
FROM waf_logs
GROUP BY httpRequest.clientIp
ORDER BY request_count DESC
LIMIT 10;
3. Integrating Athena Queries with QuickSight
To visualize the WAF log data, Athena’s results can be connected to Amazon QuickSight.
Create a QuickSight Dataset
- Open Amazon QuickSight.
- Go to Datasets and select Athena as the source.
- Provide a data source name and select the Athena data catalog.
- Choose the appropriate database and table (
waf_logs
). - Save and prepare the dataset.
Visualizing Data in QuickSight
- Client IP Requests: Use a bar chart to show the top IPs sending the most requests.
- Request Trends: Use a line chart to display the number of requests over time.
- Blocked Requests Percentage: Use a pie chart to show allowed vs. blocked requests.
Schedule Data Refresh
- Set the dataset refresh frequency to hourly or daily, depending on your needs.
4. Optimization Tips
To optimize Athena query performance and reduce costs:
- Partitioning: Store data partitioned by
year
,month
, andday
. - Parquet Conversion: Convert JSON logs to Parquet format to speed up queries.
- Reduce Query Cost: Minimize data scanning by filtering queries based on timestamps.
5. Alternative Approach for Real-Time Analysis
For real-time visualization, consider using Amazon OpenSearch Service instead of Athena:
- Use Kinesis Data Firehose to stream logs into OpenSearch.
- Create dashboards with Kibana.
- Use QuickSight for aggregated reports if necessary.
This setup allows for efficient WAF log analysis using AWS Athena and visualization in Amazon QuickSight.