Comprehensive guide to installing and deploying Dremio on AWS

Dremio is a SQL query engine that serves as a cornerstone of data lakehouse architectures, enabling fast and effective data analysis. The first part of this series covered the platform's key features; this part moves on to installation. Deploying Dremio on AWS combines its analytics and data management capabilities with the scalability and flexibility of cloud infrastructure. This guide walks through setting up Dremio either directly on an EC2 instance or via the AWS Marketplace, including the AWS-specific steps needed for a secure, optimised, and efficient deployment.

Preliminary Arrangements

Before starting the installation, confirm that your AWS account is active. If necessary, create an account at aws.amazon.com. Verify that you have IAM permissions to create and manage EC2 instances, VPCs, Security Groups, IAM roles, and key pairs; all of these are needed for a secure and compliant deployment.

AWS Specific Preparations

Region Selection

Select the AWS region closest to your user base to minimise latency. The choice also affects which instance types and features are available.

Key Pair Creation

For secure access to EC2 instances:

  1. Navigate to the EC2 dashboard in your selected region.
  2. Under “Network & Security,” choose “Key Pairs” then “Create Key Pair.”
  3. Name your key pair, download it, and store it securely.
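
The same steps can also be scripted. The following is a minimal sketch, assuming the boto3 library and an EC2 client created for your selected region; the function name save_key_pair and the key name in the usage comment are illustrative and not part of any Dremio or AWS tooling. It relies on the fact that AWS returns the private key only once, at creation time, so the key is written immediately to a file that only the owner can read.

```python
import os
from pathlib import Path


def save_key_pair(ec2_client, key_name: str, directory: str = ".") -> Path:
    """Create an EC2 key pair and store the private key with owner-only access.

    ec2_client is expected to behave like boto3.client("ec2"); it is passed in
    so the function can be tried out without touching a real AWS account.
    """
    # The private key is only returned once, in the "KeyMaterial" field.
    response = ec2_client.create_key_pair(KeyName=key_name)

    key_path = Path(directory) / f"{key_name}.pem"
    # O_EXCL refuses to overwrite an existing key file; mode 0o600 keeps the
    # file unreadable for other users from the moment it is created.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(response["KeyMaterial"])
    return key_path


# Hypothetical usage (requires boto3 and valid AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2", region_name="eu-central-1")
#   save_key_pair(ec2, "dremio-key")
```

Passing the client in as a parameter is a design choice for testability; in a one-off script you could just as well create the client inside the function.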

VPC Setup

Configure a VPC for network isolation:

  1. Use an existing VPC or create a new one in the VPC dashboard that fits your network and security criteria.
  2. Attach an Internet Gateway to your VPC for external connectivity.
  3. Adjust the route tables so that internet-bound traffic (0.0.0.0/0) from your subnet is routed to the Internet Gateway.
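
As an illustration, the three steps above could be automated roughly as follows. This is a sketch under assumptions: it uses boto3-style EC2 client calls, the function name create_public_vpc and the CIDR block 10.0.0.0/16 are placeholders, and it adds the default route to the VPC's main route table. Subnet creation and tagging are left out for brevity.

```python
def create_public_vpc(ec2_client, cidr_block: str = "10.0.0.0/16") -> dict:
    """Create a VPC, attach an Internet Gateway, and add a default route.

    ec2_client is expected to behave like boto3.client("ec2").
    The CIDR block is an example value; pick one that fits your network plan.
    """
    # Step 1: create the VPC.
    vpc_id = ec2_client.create_vpc(CidrBlock=cidr_block)["Vpc"]["VpcId"]

    # Step 2: create an Internet Gateway and attach it to the VPC.
    igw = ec2_client.create_internet_gateway()
    igw_id = igw["InternetGateway"]["InternetGatewayId"]
    ec2_client.attach_internet_gateway(InternetGatewayId=igw_id, VpcId=vpc_id)

    # Step 3: look up the VPC's main route table and send all
    # internet-bound traffic (0.0.0.0/0) to the gateway.
    tables = ec2_client.describe_route_tables(
        Filters=[
            {"Name": "vpc-id", "Values": [vpc_id]},
            {"Name": "association.main", "Values": ["true"]},
        ]
    )["RouteTables"]
    route_table_id = tables[0]["RouteTableId"]
    ec2_client.create_route(
        RouteTableId=route_table_id,
        DestinationCidrBlock="0.0.0.0/0",
        GatewayId=igw_id,
    )

    return {
        "vpc_id": vpc_id,
        "internet_gateway_id": igw_id,
        "route_table_id": route_table_id,
    }
```

If you reuse an existing VPC instead, only the route table check in step 3 is likely to be relevant.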

Security Group Configuration

Create a Security Group within your VPC:

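  1. Add inbound rules to allow access on ports 9047 (Dremio UI) and 31010 (ODBC/JDBC clients).
  2. Where possible, restrict the source of these rules to specific IP ranges instead of allowing access from anywhere. For the direct installation method you will also need SSH access (port 22) to the instance.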
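
To make the rule layout concrete, here is a small sketch that builds the inbound rules for the two Dremio ports in the structure that boto3's authorize_security_group_ingress call accepts as IpPermissions. The helper name build_ingress_rules, the example CIDR range, and the choice to reject 0.0.0.0/0 outright are assumptions made for illustration; adapt them to your own security policy.

```python
import ipaddress

# Ports named in this guide and what they are used for.
DREMIO_PORTS = {9047: "Dremio UI", 31010: "ODBC/JDBC clients"}


def build_ingress_rules(allowed_cidrs):
    """Return one TCP ingress rule per Dremio port, limited to allowed_cidrs."""
    ranges = []
    for cidr in allowed_cidrs:
        # IPv4Network raises ValueError for malformed CIDR strings.
        network = ipaddress.IPv4Network(cidr)
        if network.prefixlen == 0:
            # Policy choice of this sketch: never open the ports to everyone.
            raise ValueError("refusing to open Dremio ports to 0.0.0.0/0")
        ranges.append(str(network))

    rules = []
    for port, description in DREMIO_PORTS.items():
        rules.append(
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "IpRanges": [
                    {"CidrIp": cidr, "Description": description}
                    for cidr in ranges
                ],
            }
        )
    return rules


# Hypothetical usage with boto3 (group ID and CIDR are placeholders):
#   ec2.authorize_security_group_ingress(
#       GroupId="sg-...",
#       IpPermissions=build_ingress_rules(["203.0.113.0/24"]),
#   )
```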

Installation Methods

Direct Installation Method

  • EC2 Instance Setup: Launch an EC2 instance within your VPC using the Amazon Linux 2 AMI and the key pair created earlier. Choose a memory-optimised instance type such as r4.large, and allocate sufficient storage.
  • Java Installation: The Dremio version used in this guide requires Java 8; check the Dremio documentation for the Java version your release supports. Install it on your EC2 instance with sudo yum install java-1.8.0-openjdk.
  • Dremio Installation: Download and install the Dremio RPM package, then start Dremio using sudo service dremio start.
  • Dremio UI Access: Navigate to http://<EC2-Instance-IP>:9047 to access the Dremio UI.
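
After starting the service, it can take a little while before the UI answers. The sketch below is one way to wait for it, assuming only the Python standard library: it repeatedly tries to open a TCP connection to port 9047 until the port accepts connections or a timeout expires. The function name wait_for_port and the timeout values are illustrative. Note that an open port only shows that something is listening, not that Dremio has finished initialising.

```python
import socket
import time


def wait_for_port(host: str, port: int = 9047,
                  timeout_s: float = 300.0, interval_s: float = 5.0) -> bool:
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect means the service is listening.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Refused, unreachable, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


# Hypothetical usage (replace the placeholder with your instance's address):
#   if wait_for_port("<EC2-Instance-IP>"):
#       print("Dremio UI port is reachable")
```

If the function returns False, the Security Group rules and the service status on the instance are the first things to check.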

AWS Marketplace Installation Method

  • Subscription: Search for Dremio in the AWS Marketplace, select the Cloud version, and click “Continue to Subscribe” to review and accept terms.
  • Configuration: After subscribing, configure the software settings, including the region and instance type, and specify the VPC and Security Group prepared earlier.
  • Deployment: Launch the instance from the AWS Marketplace console. AWS handles the deployment, providing a URL to access the Dremio UI upon completion.

Post-Installation Adjustments

After installation, connect Dremio to your data sources, set up users and permissions, and apply optimisation and security best practices.

AWS Marketplace Installation Screenshots

[Screenshots 1 to 15: Dremio installation steps in the AWS Marketplace]

Conclusion

By following this guide, IT professionals can set up Dremio on AWS and prepare a solid environment for data analysis. Consult the official Dremio documentation regularly and take part in the community discussions to stay up to date and resolve any issues.

Additional resources

Official Dremio documentation

Dremio Community Forums
