Amazon S3

Created by Rahul Pattamatta, Modified on Wed, 14 Aug at 2:17 AM by Rahul Pattamatta

Getting Started with S3 Source Configuration in Databrain

Requirements:

Setup Guide:

Ensure Bucket Accessibility:
- Make sure your S3 bucket is active and accessible from Databrain.
- Whether the bucket is reachable depends on your AWS account settings and the bucket's permissions.
Grant Necessary Permissions:
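- Read Access on Buckets and Objects: Grant read access to each S3 bucket you want to sync and to the objects in it. In IAM terms this typically means the s3:ListBucket action on the bucket and the s3:GetObject action on its objects.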
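
The read access described above could be expressed as an IAM policy. The sketch below builds such a policy document in Python. It assumes that s3:ListBucket and s3:GetObject are sufficient for Databrain, which this article does not confirm, and build_read_only_policy is a hypothetical helper name used only for illustration.

```python
import json


def build_read_only_policy(bucket_name: str, folder_path: str = "") -> dict:
    """Return an IAM policy document granting read-only access to one bucket.

    Assumption: listing the bucket and reading its objects is all the
    connection needs. Adjust the actions if your setup requires more.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    prefix = folder_path.strip("/")
    # Limit object reads to the dataset folder when one is given.
    object_arn = f"{bucket_arn}/{prefix}/*" if prefix else f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [object_arn],
            },
        ],
    }


if __name__ == "__main__":
    # Bucket and folder names are the examples used in this article.
    policy = build_read_only_policy("databrain-s3-test-csv", "awss3_folder_test_less/")
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM policy editor and attached to the user whose access keys you give to Databrain.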
Fill In Connection Info:
- Provide the following information to connect to your S3 bucket:
  - Destination Name: A custom name to identify this connection in Databrain.
  - S3 Region: The AWS region where your S3 bucket is located (e.g., us-east-1).
  - S3 Access Key ID: Your AWS Access Key ID for authentication.
  - S3 Secret Access Key: Your AWS Secret Access Key associated with the Access Key ID.
  - S3 Bucket Dataset Folder Path: The specific folder path within your bucket (e.g., awss3_folder_test_less/).
  - S3 Bucket Name: The name of your S3 bucket (e.g., databrain-s3-test-csv).
  - Table Level: Select whether to interpret data at the Folder or File level.
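
Before entering these values in Databrain, it can help to sanity-check them locally. The following is a minimal sketch, not part of Databrain: S3ConnectionInfo and its field names are hypothetical, the bucket-name pattern follows the general S3 naming rules (3 to 63 characters; lowercase letters, digits, dots, and hyphens), and the region pattern is only a rough shape check.

```python
import re
from dataclasses import dataclass
from typing import List

# General S3 bucket naming rules: 3-63 chars, lowercase letters, digits,
# dots and hyphens, starting and ending with a letter or digit.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")
# Rough shape of a region code such as us-east-1 (not an exhaustive check).
REGION_RE = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


@dataclass
class S3ConnectionInfo:
    """Hypothetical container mirroring the fields listed above."""

    destination_name: str
    region: str
    access_key_id: str
    secret_access_key: str
    folder_path: str
    bucket_name: str
    table_level: str  # "folder" or "file"

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means none were found."""
        issues = []
        if not self.destination_name.strip():
            issues.append("destination name is empty")
        if not REGION_RE.match(self.region):
            issues.append(f"region {self.region!r} does not look like a region code")
        if not self.access_key_id or not self.secret_access_key:
            issues.append("access key ID or secret access key is missing")
        if not BUCKET_NAME_RE.match(self.bucket_name):
            issues.append(f"bucket name {self.bucket_name!r} breaks S3 naming rules")
        if self.table_level.lower() not in ("folder", "file"):
            issues.append("table level must be 'folder' or 'file'")
        return issues
```

Calling problems() on a filled-in instance returns a list of messages to fix before saving the connection. Passing these checks does not guarantee that the credentials or bucket actually work.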

Permissions:

Locating the Configuration Details in AWS S3

Destination Name:
- Choose a descriptive name for this connection within Databrain.
S3 Region:
- Log in to the AWS Management Console and open the S3 service.
- Select your bucket, and find the region information in the bucket's "Properties" tab.
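
If you prefer to look up the region programmatically rather than in the console, something like the sketch below may work. It assumes the third-party boto3 package is installed, that AWS credentials are configured locally, and that the caller is allowed to call s3:GetBucketLocation. The function names are illustrative.

```python
from typing import Optional


def normalize_location_constraint(location: Optional[str]) -> str:
    """Map the LocationConstraint value returned by S3 to a region code.

    S3 reports buckets in us-east-1 with an empty constraint, and some
    older buckets in eu-west-1 with the legacy value "EU".
    """
    if not location:
        return "us-east-1"
    if location == "EU":
        return "eu-west-1"
    return location


def lookup_bucket_region(bucket_name: str) -> str:
    """Ask S3 for the bucket's region (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported here so the helper above works without it

    response = boto3.client("s3").get_bucket_location(Bucket=bucket_name)
    return normalize_location_constraint(response.get("LocationConstraint"))
```

For example, lookup_bucket_region("databrain-s3-test-csv") would return the region code to enter in the S3 Region field, provided the bucket exists and your credentials can read its location.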
S3 Access Key ID & Secret Access Key:
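- Access keys are generated in the IAM (Identity and Access Management) section of AWS.
- Navigate to IAM, select the desired user, go to the "Security credentials" tab, and create or manage access keys. The Secret Access Key is shown only once, when the key is created, so copy it at that point.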
S3 Bucket Dataset Folder Path:
- Navigate to your bucket in the S3 console and note the specific folder path you wish to sync.
S3 Bucket Name:
- This is the name of your S3 bucket, visible in the S3 dashboard of the AWS Management Console.
Table Level:
- Decide whether your data should be interpreted at the Folder level or the File level, based on how the data is organized in your S3 bucket.
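
To make the folder-versus-file choice more concrete, the sketch below groups object keys into tables in both ways. This is only an illustration of one plausible interpretation: it assumes that Folder level means one table per folder and File level means one table per file, which this article does not spell out. group_keys_into_tables and the sample keys are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, Iterable, List


def group_keys_into_tables(keys: Iterable[str], table_level: str) -> Dict[str, List[str]]:
    """Group S3 object keys into tables by parent folder or by file name.

    Assumed semantics (not confirmed by Databrain):
      "folder" -> files sharing a parent folder form one table
      "file"   -> every file is its own table, named after the file
    """
    if table_level not in ("folder", "file"):
        raise ValueError("table_level must be 'folder' or 'file'")
    tables = defaultdict(list)
    for key in keys:
        if key.endswith("/"):
            continue  # skip folder placeholder objects
        path = PurePosixPath(key)
        name = path.parent.name if table_level == "folder" else path.stem
        tables[name].append(key)
    return dict(tables)


if __name__ == "__main__":
    sample_keys = [
        "awss3_folder_test_less/orders/2024-01.csv",
        "awss3_folder_test_less/orders/2024-02.csv",
        "awss3_folder_test_less/customers/all.csv",
    ]
    print(group_keys_into_tables(sample_keys, "folder"))
    print(group_keys_into_tables(sample_keys, "file"))
```

Under these assumptions, a bucket that stores many files with the same columns per folder fits the folder level, while a bucket where each file has its own schema fits the file level.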