This article covers connecting Unstructured to Amazon S3 Vectors. To connect Unstructured to Amazon S3 without S3 Vectors support, see S3.
If you’re new to Unstructured, read this note first.
Before you can create a destination connector, you must first sign in to your Unstructured account. After you sign in, the Unstructured user interface (UI) appears, which you use to create your destination connector.
After you create the destination connector, add it along with a source connector to a workflow. Then run the workflow as a job. To learn how, try out the hands-on UI quickstart or watch the 4-minute video tutorial.
You can also create destination connectors with the Unstructured API. Learn how.
If you need help, email Unstructured Support at support@unstructured.io.
You are now ready to start creating a destination connector! Keep reading to learn how.
Send processed data from Unstructured to Amazon S3 Vectors. The requirements are as follows.
  • An Amazon S3 Vectors bucket.
  • The AWS Region (such as us-east-1) of the target S3 Vectors bucket. Learn how to get the Region of an existing S3 Vectors bucket.
  • An index for the target S3 Vectors bucket. When creating an index, be sure to specify these settings:
    • For Vector index name, specify any name that follows the allowed naming pattern.
    • For Dimension, only specify a number that is supported by Unstructured’s available embedding models.
    • For Distance metric, only specify Cosine.
    • For Metadata configuration under Additional settings, Unstructured recommends that you specify the following 10 keys for Non-filterable metadata:
      • text
      • link_urls
      • link_texts
      • coordinates-points
      • coordinates-system
      • data_source-url
      • data_source-record_locator
      • data_source-date_created
      • data_source-date_modified
      • data_source-date_processed
    • There are no Unstructured-specific requirements for Encryption or Tags.
    Learn more about these index settings.
  • The number of dimensions configured for the target index. Learn how to get the index’s number of dimensions.
  • The AWS access key ID and the AWS secret access key for the target AWS IAM principal (such as an IAM user or group) that has the appropriate access to the S3 Vectors bucket.
    • If you use identity-based policies to control access, the target IAM principal must have at minimum the following access permissions. Replace the following placeholders:
      • Replace <region-short-id> with the AWS Region short ID of the target S3 Vectors bucket.
      • Replace <account-id> with the AWS account ID of the target S3 Vectors bucket.
      • Replace <bucket-name> with the name of the target S3 Vectors bucket.
      • Replace <index-name> with the name of the target index.
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "AccountBucketListing",
                  "Effect": "Allow",
                  "Action": [
                      "s3vectors:ListVectorBuckets"
                  ],
                  "Resource": "*"
              },
              {
                  "Sid": "AllowBucketAccess",
                  "Effect": "Allow",
                  "Action": [
                      "s3vectors:GetVectorBucket",
                      "s3vectors:ListIndexes"
                  ],
                  "Resource": "arn:aws:s3vectors:<region-short-id>:<account-id>:bucket/<bucket-name>"
              },
              {
                  "Sid": "AllowIndexAccess",
                  "Effect": "Allow",
                  "Action": [
                      "s3vectors:ListIndexes",
                      "s3vectors:GetIndex",
                      "s3vectors:ListVectors",
                      "s3vectors:QueryVectors",
                      "s3vectors:PutVectors",
                      "s3vectors:GetVectors",
                      "s3vectors:DeleteVectors"
                  ],
                  "Resource": "arn:aws:s3vectors:<region-short-id>:<account-id>:bucket/<bucket-name>/index/<index-name>"
              }
          ]
      }
      
      Learn more about these S3 Vectors access permissions.
    • Learn how to attach an access policy to an IAM user, group, or role.
    • Learn how to create and manage AWS access key IDs and their related AWS secret access keys for IAM users.
    • Learn how to switch from an IAM user to a role for temporary access.
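The requirements above involve two pieces of structured data: the index settings and the identity-based policy with its placeholders filled in. The following Python sketch assembles both as plain dictionaries. It is illustrative only: the function names `index_settings` and `render_policy` are hypothetical, the index setting key names and the `float32` data type are assumptions modeled on the S3 Vectors CreateIndex request, and the `dimension` value must be one that your chosen Unstructured embedding model actually produces. Verify the key names against the AWS documentation before using them.

```python
import json

# The 10 keys that the list above recommends as non-filterable metadata.
NON_FILTERABLE_KEYS = [
    "text",
    "link_urls",
    "link_texts",
    "coordinates-points",
    "coordinates-system",
    "data_source-url",
    "data_source-record_locator",
    "data_source-date_created",
    "data_source-date_modified",
    "data_source-date_processed",
]


def index_settings(bucket_name: str, index_name: str, dimension: int) -> dict:
    """Assemble index settings that match the requirements above.

    Key names are an assumption modeled on the S3 Vectors CreateIndex request.
    """
    return {
        "vectorBucketName": bucket_name,
        "indexName": index_name,
        "dataType": "float32",       # assumed data type
        "dimension": dimension,      # must match the embedding model's output
        "distanceMetric": "cosine",  # the only metric the requirements allow
        "metadataConfiguration": {
            "nonFilterableMetadataKeys": list(NON_FILTERABLE_KEYS),
        },
    }


def render_policy(region: str, account_id: str, bucket_name: str, index_name: str) -> dict:
    """Fill in the placeholders of the identity-based policy shown above."""
    bucket_arn = f"arn:aws:s3vectors:{region}:{account_id}:bucket/{bucket_name}"
    index_arn = f"{bucket_arn}/index/{index_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AccountBucketListing",
                "Effect": "Allow",
                "Action": ["s3vectors:ListVectorBuckets"],
                "Resource": "*",
            },
            {
                "Sid": "AllowBucketAccess",
                "Effect": "Allow",
                "Action": ["s3vectors:GetVectorBucket", "s3vectors:ListIndexes"],
                "Resource": bucket_arn,
            },
            {
                "Sid": "AllowIndexAccess",
                "Effect": "Allow",
                "Action": [
                    "s3vectors:ListIndexes",
                    "s3vectors:GetIndex",
                    "s3vectors:ListVectors",
                    "s3vectors:QueryVectors",
                    "s3vectors:PutVectors",
                    "s3vectors:GetVectors",
                    "s3vectors:DeleteVectors",
                ],
                "Resource": index_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Example values only; replace them with your own Region, account, bucket, and index.
    policy = render_policy("us-east-1", "123456789012", "my-vector-bucket", "my-index")
    print(json.dumps(policy, indent=4))
```

Generating the policy from the four placeholder values keeps the bucket ARN and the index ARN consistent with each other, which is easy to get wrong when editing the JSON by hand.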

Create the destination connector

To create the destination connector:
  1. On the sidebar, click Connectors.
  2. Click Destinations.
  3. Click New or Create Connector.
  4. Enter a unique Name for the connector.
  5. In the Provider area, click Amazon S3 Vectors.
  6. Click Continue.
  7. Follow the on-screen instructions to fill in the fields as described later on this page.
  8. Click Save and Test.
Fill in the following fields:
  • Name (required): A unique name for this connector.
  • Region (required): The AWS Region (such as us-east-1) of the target Amazon S3 Vectors bucket.
  • Key (required): The AWS access key ID for the target AWS IAM principal that has the appropriate access to the target bucket.
  • Secret (required): The AWS secret access key for the corresponding AWS access key ID.
  • Vector Bucket Name (required): The name of the target bucket.
  • Index Name (required): The name of the target index in the bucket.
  • Batch Size: The maximum number of vectors to include in a single batch. The maximum is 500. The default is 100.
  • Key Prefix: A string to prepend to each vector key. By default, no string is prepended.
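The field rules above can be sketched as a small validation helper. This is a minimal illustration, not the connector's actual implementation: the `S3VectorsConnectorSettings` class, the `prefixed_key` and `batches` helpers, and the assumption that a batch size must be at least 1 are all hypothetical. Only the required fields, the default of 100, and the maximum of 500 come from the field list above.

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 500      # documented maximum
DEFAULT_BATCH_SIZE = 100  # documented default


@dataclass
class S3VectorsConnectorSettings:
    """Hypothetical container mirroring the connector fields listed above."""

    name: str
    region: str
    key: str
    secret: str
    vector_bucket_name: str
    index_name: str
    batch_size: int = DEFAULT_BATCH_SIZE
    key_prefix: str = ""  # empty string means no prefix is prepended

    def __post_init__(self) -> None:
        required = {
            "Name": self.name,
            "Region": self.region,
            "Key": self.key,
            "Secret": self.secret,
            "Vector Bucket Name": self.vector_bucket_name,
            "Index Name": self.index_name,
        }
        missing = [label for label, value in required.items() if not value]
        if missing:
            raise ValueError(f"Missing required fields: {', '.join(missing)}")
        # The lower bound of 1 is an assumption; only the maximum is documented.
        if not 1 <= self.batch_size <= MAX_BATCH_SIZE:
            raise ValueError(f"Batch Size must be between 1 and {MAX_BATCH_SIZE}")


def prefixed_key(settings: S3VectorsConnectorSettings, vector_key: str) -> str:
    """Prepend the configured Key Prefix, if any, to a vector key."""
    return f"{settings.key_prefix}{vector_key}"


def batches(items: Sequence, batch_size: int) -> Iterator[List]:
    """Split items into consecutive batches of at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])
```

For example, with the default Batch Size of 100, a set of 250 vectors would be split into batches of 100, 100, and 50.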