# AWS Glue code examples for the SDK for Ruby

## Overview

Shows how to use the AWS SDK for Ruby to work with AWS Glue.

*AWS Glue is a scalable, serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development.*

## ⚠ Important

* Running this code might result in charges to your AWS account.
* Running the tests might result in charges to your AWS account.
* We recommend that you grant your code least privilege. At most, grant only the minimum permissions required to perform the task. For more information, see [Grant least privilege](https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html#grant-least-privilege).
* This code is not tested in every AWS Region. For more information, see [AWS Regional Services](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services).

## Code examples

### Prerequisites

For prerequisites, see the [README](../../README.md#Prerequisites) in the `ruby` folder.

### Single actions

Code excerpts that show you how to call individual service functions.

* [Create a crawler](glue_wrapper.rb#L34) (`CreateCrawler`)
* [Create a job definition](glue_wrapper.rb#L116) (`CreateJob`)
* [Delete a crawler](glue_wrapper.rb#L75) (`DeleteCrawler`)
* [Delete a database from the Data Catalog](None) (`DeleteDatabase`)
* [Delete a job definition](glue_wrapper.rb#L203) (`DeleteJob`)
* [Delete a table from a database](glue_wrapper.rb#L215) (`DeleteTable`)
* [Get a crawler](glue_wrapper.rb#L18) (`GetCrawler`)
* [Get a database from the Data Catalog](glue_wrapper.rb#L88) (`GetDatabase`)
* [Get a job run](glue_wrapper.rb#L189) (`GetJobRun`)
* [Get runs of a job](glue_wrapper.rb#L178) (`GetJobRuns`)
* [Get tables from a database](glue_wrapper.rb#L102) (`GetTables`)
* [List job definitions](glue_wrapper.rb#L166) (`ListJobs`)
* [Start a crawler](glue_wrapper.rb#L62) (`StartCrawler`)
* [Start a job run](glue_wrapper.rb#L142) (`StartJobRun`)

### Scenarios

Code examples that show you how to accomplish a specific task by calling multiple functions within the same service.

* [Get started with crawlers and jobs](glue_wrapper.rb)

## Run the examples

### Instructions

#### Get started with crawlers and jobs

This example shows you how to do the following:

* Create a crawler that crawls a public Amazon S3 bucket and generates a database of CSV-formatted metadata.
* List information about databases and tables in your AWS Glue Data Catalog.
* Create a job to extract CSV data from the S3 bucket, transform the data, and load JSON-formatted output into another S3 bucket.
* List information about job runs, view transformed data, and clean up resources.

Before you run the example, you will need to set up an S3 bucket linked with an IAM role. To accomplish this, navigate to the [glue_role_bucket CDK script](../../../resources/cdk/glue_role_bucket) and follow the instructions.

After completing setup, you can run the following at a command prompt:

```
ruby glue_wrapper.rb
```

### Tests

⚠ Running tests might result in charges to your AWS account.

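Because calls to the real service can incur charges, one way to check crawler-handling logic locally is to pass in a stand-in client. The following is a minimal sketch and not the test setup used in this repository: `FakeGlueClient` and `run_crawler` are hypothetical names introduced here for illustration. The fake mimics only the `start_crawler` and `get_crawler` operations, and the crawler states (`RUNNING`, `STOPPING`, `READY`) are scripted.

```ruby
# Hypothetical stand-in for Aws::Glue::Client. It makes no network calls and
# only mimics the two operations used below (start_crawler and get_crawler).
class FakeGlueClient
  Crawler = Struct.new(:name, :state)
  Response = Struct.new(:crawler)

  attr_reader :started

  # states: the crawler states to report, in order, on successive polls.
  def initialize(states)
    @states = states.dup
    @started = []
  end

  # Records the request instead of starting anything.
  def start_crawler(name:)
    @started << name
    nil
  end

  # Returns the next scripted state and repeats the last one when the script runs out.
  def get_crawler(name:)
    state = @states.length > 1 ? @states.shift : @states.first
    Response.new(Crawler.new(name, state))
  end
end

# Hypothetical helper: starts a crawler, then polls until it reports READY.
# Returns true if READY was seen within max_polls polls, false otherwise.
def run_crawler(client, name, max_polls: 10, delay: 0)
  client.start_crawler(name: name)
  max_polls.times do
    return true if client.get_crawler(name: name).crawler.state == 'READY'

    sleep(delay)
  end
  false
end

fake = FakeGlueClient.new(%w[RUNNING STOPPING READY])
finished = run_crawler(fake, 'example-crawler')
puts "Crawler finished: #{finished}"
```

Against a real client, the same helper would need a nonzero `delay` between polls, and those calls would reach AWS and might incur charges.
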
To find instructions for running these tests, see the [README](../../README.md#Tests) in the `ruby` folder.

## Additional resources

* [SDK for Ruby Developer Guide](https://aws.amazon.com/developer/language/ruby/)
* [SDK for Ruby AWS Glue Module](https://docs.aws.amazon.com/sdk-for-ruby/v3/api/Aws/Glue.html)
* [AWS Glue Developer Guide](https://docs.aws.amazon.com/glue/latest/dg/what-is-glue.html)
* [AWS Glue API Reference](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api.html)

---

Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.

SPDX-License-Identifier: Apache-2.0