1. What is a QuickSight Dataset?
An Amazon QuickSight dataset is a structured collection of data used for visualization and analysis in QuickSight.
It connects to external data sources (Athena, S3, RDS, Redshift, etc.) and allows users to build dashboards and reports.
2. Key Functions of a Dataset
- Defines the data source (Athena, S3, RDS, Redshift, DynamoDB, etc.).
- Stores columns and data types (e.g.,
string
,integer
,datetime
). - Supports custom fields and calculated fields.
- Enables data transformations (filters, joins, aggregations).
- Allows scheduled updates for real-time or periodic data refresh.
3. Types of QuickSight Datasets
Amazon QuickSight supports two types of datasets:
Dataset Type | Description | Use Case |
---|---|---|
SPICE (Super-fast, Parallel, In-memory Calculation Engine) | Stores data in QuickSight’s in-memory engine for high-speed queries. | Best for large-scale reports with periodic updates. |
Direct Query | Fetches data directly from the source (Athena, RDS, etc.) at query time. | Best for real-time analytics with frequently changing data. |
SPICE is recommended for performance improvements, while Direct Query is ideal for real-time analysis.
4. Creating a QuickSight Dataset
Step 1: Connect to a Data Source
- Open AWS QuickSight.
- Go to the “Datasets” tab.
- Click “New Dataset”.
- Select a data source (Athena, S3, RDS, Redshift, etc.).
- Configure the connection details and create the data source.
Step 2: Create a Dataset
- Choose the created data source.
- Select “Custom SQL” or “Select a Table”.
- If using Custom SQL, enter a query.
Example Query (for aggregating data):
SELECT user_ip, COUNT(*) AS request_count
FROM access_logs
WHERE event_type = 'BLOCKED'
AND event_date = '2025-03-01'
GROUP BY user_ip
ORDER BY request_count DESC
LIMIT 10;
- Click “Confirm” → “Create Dataset”.
- Select SPICE or Direct Query based on your requirements.
- Click “Finish”.
5. Updating a Dataset
QuickSight datasets can be refreshed manually or scheduled for automatic updates.
Manual Refresh
- Open the Datasets page in QuickSight.
- Select the dataset.
- Click “Refresh Data”.
Scheduled Updates
- Open the Dataset Settings.
- Click “Set Scheduled Refresh”.
- Define the refresh frequency (e.g., Daily at 03:00 AM).
- Click “Save”.
SPICE allows scheduled updates, while Direct Query always fetches real-time data.
6. Summary
Feature | Description |
---|---|
Dataset Definition | A structured data source for QuickSight analysis. |
Data Sources | Supports Athena, S3, RDS, Redshift, and more. |
Query Type | SPICE (In-memory) or Direct Query (Real-time). |
Custom SQL | Allows filtering and data transformation. |
Update Options | Manual refresh or scheduled updates. |
7. Conclusion
- QuickSight datasets define the data structure for analysis.
- SPICE provides faster performance, while Direct Query ensures real-time updates.
- Custom SQL allows data filtering and transformation before visualization.
- Scheduling updates improves dashboard reliability and automation.