[AWS] AWS CHEAT SHEET

AWS CHEAT SHEET

Reference : AWS CLI

Table of Contents

Athena

Partition Projections ### DROP #### Drop Table ```sql ALTER TABLE DROP IF EXISTS PARTITION(year='yyyy', month='MM', day='dd') ``` ### SHOW #### Show Create Table ```sql SHOW CREATE TABLE ``` 👉🏻 This shows table, and configuration info. #### Show Partitions ```sql SHOW PARTITIONS ``` 👉🏻 This shows whole partitioned data. [↑ return to TOC](#table-of-contents) #### REPAIR ##### Repair Manually If your data looks like this, `s3://bucketName/path/distributionID/yyyy/MM/dd/hh` than ```sql ALTER TABLE . ADD PARTITION (year='yyyy',month='MM', day='dd') LOCATION 's3://bucketName/path/distributionID/yyyy/MM/dd/hh' ``` ```sql # example code ALTER TABLE default.cloudfront-test ADD PARTITION (year='2020',month='10', day='05') LOCATION 's3://cloudfront-test/logs/abcdeabcded/2020/10/05/00' ``` ##### Repair Automatically If your data looks like this, `s3://bucketName/path/distributionID/year=2020/month=10/day=05/hour=00` than ```sql MSCK REPAIR TABLE ; ``` [↑ return to TOC](#table-of-contents) </details>
CLI Commands ### LIST #### List Data Catalogs `aws athena --list-data-catalogs` > example of output: ``` { "DataCatalogsSummary": [ { "CatalogName": "AwsDataCatalog", "Type": "GLUE" } ] } ``` **DataCatalogsSummary** ⇒ A summary list of data catalogs **CatalogName** ⇒ The name of the data catalog **Type** ⇒ The data catalog type. [↑ return to TOC](#table-of-contents) #### List Database `aws athena list-databases --catalog-name ` * NOTE : Case-insensitive > example : `aws athena list-databases --catalog-name AwsDataCatalog` ```{ "DatabaseList": [ { "Name": "local", "Description": "Database for Local Shop" }, { "Name": "global", "Description": "Database for Outside of USA Shop", "Parameters": { "CreatedBy": "Athena", "EXTERNAL": "TRUE" } } ] } ``` [↑ return to TOC](#table-of-contents) #### List Table Metadata `aws athena list-table-metadata --catalog-name --database-name ` * w/ **--max-items** option `aws athena list-table-metadata --catalog-name --database-name --max-items ` > example of : `aws athena list-table-metadata --catalog-name AwsDataCatalog --database-name geography --max-items 2` ``` { "TableMetadataList": [ { "Name": "country_codes", "CreateTime": 1586553454.0, "TableType": "EXTERNAL_TABLE", "Columns": [ { "Name": "country", "Type": "string", "Comment": "geo id" }, { "Name": "alpha-2 code", "Type": "string", "Comment": "geo id2" }, { "Name": "alpha-3 code", "Type": "string", "Comment": "state name" }, { "Name": "numeric code", "Type": "bigint", "Comment": "" }, { "Name": "latitude", "Type": "bigint", "Comment": "location (latitude)" }, { "Name": "longitude", "Type": "bigint", "Comment": "location (longitude)" } ], "Parameters": { "areColumnsQuoted": "false", "classification": "csv", "columnsOrdered": "true", "delimiter": ",", "has_encrypted_data": "false", "inputformat": "org.apache.hadoop.mapred.TextInputFormat", "location": "s3://awsdoc-example-bucket/csv/countrycode", "outputformat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "serde.param.field.delim": ",", "serde.serialization.lib": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "skip.header.line.count": "1", "typeOfData": "file" } }, { "Name": "county_populations", "CreateTime": 1586553446.0, "TableType": "EXTERNAL_TABLE", "Columns": [ { "Name": "id", "Type": "string", "Comment": "geo id" }, { "Name": "country", "Name": "id2", "Type": "string", "Comment": "geo id2" }, { "Name": "county", "Type": "string", "Comment": "county name" }, { "Name": "state", "Type": "string", "Comment": "state name" }, { "Name": "population estimate 2018", "Type": "string", "Comment": "" } ], "Parameters": { "areColumnsQuoted": "false", "classification": "csv", "columnsOrdered": "true", "delimiter": ",", "has_encrypted_data": "false", "inputformat": "org.apache.hadoop.mapred.TextInputFormat", "location": "s3://awsdoc-example-bucket/csv/CountyPopulation", "outputformat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat", "serde.param.field.delim": ",", "serde.serialization.lib": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe", "skip.header.line.count": "1", "typeOfData": "file" } } ], "NextToken": "eyJOZXh0VG9rZW4iOiBudWxsLCAiYm90b190cnVuY2F0ZV9hbW91bnQiOiAyfQ==" } ``` [↑ return to TOC](#table-of-contents) #### List Query Execution `aws athena list-query-executions` ``` { "QueryExecutionIds": [ "XXXXXXXX-1a00-XXXX-a348-XXXXXXXXXXXX", "XXXXXXXX-6f9c-XXXX-b0b6-XXXXXXXXXXXX", "XXXXXXXX-2942-XXXX-82d1-XXXXXXXXXXXX", "XXXXXXXX-f2e0-XXXX-8670-XXXXXXXXXXXX", "XXXXXXXX-e6f4-XXXX-9f41-XXXXXXXXXXXX", . . . ] } ``` **QueryExecutionIds** ⇒ The unique IDs of each query execution. * w/ **--max-items** option `aws athena list-query-executions --max-items ` > example : `aws athena list-query-executions --max-items 3` ``` { "QueryExecutionIds": [ "XXXXXXXX-1a00-XXXX-a348-XXXXXXXXXXXX", "XXXXXXXX-6f9c-XXXX-b0b6-XXXXXXXXXXXX", "XXXXXXXX-2942-XXXX-82d1-XXXXXXXXXXXX", ] “NextToken”: “eyJOZXXXXXXXXXXXXXfQ==” } ``` **NextToken** ⇒ A token to be used by the next reuqest if this request is truncacted. [↑ return to TOC](#table-of-contents) ### GET #### Get Query Results `aws athena get-query-execution --query-execution-id ` > example : `aws athena-get-query-exectution --query-execution-id XXXXXXXX-1a00-XXXX-a348-XXXXXXXXXXXX` ``` { "QueryExecution": { "QueryExecutionId": "XXXXXXXX-1a00-XXXX-a348-XXXXXXXXXXXX", "Query": "SELECT *\nFROM local.bluebottle\nWHERE dt=\"date\"('2020-10-15')", "StatementType": "DML", "ResultConfiguration": { "OutputLocation": "s3://bluebottle/shop/2020/10/15/XXXXXXXX-1a00-XXXX-a348-XXXXXXXXXXXX.csv" }, "QueryExecutionContext": {}, "Status": { "State": "SUCCEEDED", "SubmissionDateTime": "2020-10-16T14:13:10.355000+09:00", "CompletionDateTime": "2020-10-16T14:13:40.378000+09:00" }, "Statistics": { "EngineExecutionTimeInMillis": 29787, "DataScannedInBytes": 216541413, "TotalExecutionTimeInMillis": 30023, "QueryQueueTimeInMillis": 154, "QueryPlanningTimeInMillis": 728, "ServiceProcessingTimeInMillis": 82 }, "WorkGroup": "primary" } } ``` [↑ return to TOC](#table-of-contents) </details> ## S3
CLI Commands ### LIST #### List Buckets `aws s3 ls ` > example : `aws s3 ls bluebottle` ``` 2020-10-05 17:08:50 mybucketA 2020-10-06 14:55:44 mybucketB ``` * w/ **--profile** option `aws --profile s3 ls ` * w/ **--human-readable** option `aws s3 ls s3://bucketName/path --human-readable` Displays the size of the obejcts in human readable format. * w/ **--recursive** option `aws s3 ls s3://bucketName --recursive` Displays all files include sub-directories. > example : `aws s3 ls s3://bluebottle --recursive` ``` 2020-09-24 12:45:12 1364 path/2020/10/09/abcd.metadata ``` 1. **First column** : timestamp 1. **Second column** : object size 1. **Third column** : object name `aws s3api list-buckets` > example of output : ``` { "Buckets": [ { "Name": "mybucketA" "CreationDate": "2020-10-05T17:08:50.000Z", }, { "Name": "mybucketB" "CreationDate": "2020-10-06T14:55:44.000Z", }, { "Name": "mybucketC" "CreationDate": "2020-01-01T23:32:05+00:00", }, ], "Owner": { "DisplayName": "userName", "ID": "userID" }, } ``` 👉🏻  Buckets are returned in alphabetical order. [↑ return to TOC](#table-of-contents) #### List Objects `aws s3 ls s3:////` > example : `aws s3 ls s3://bluebottle/shop/` ``` PRE 2002/ PRE 2003/ PRE 2004/ PRE 2005/ PRE 2006/ PRE 2007/ PRE 2008/ PRE 2009/ PRE 2010/ PRE 2011/ PRE 2012/ PRE 2013/ PRE 2014/ PRE 2015/ PRE 2016/ PRE 2017/ PRE 2018/ PRE 2019/ PRE 2020/ ``` > **PRE** stands for **Pre**fix of ann S3 object. * w/ **--profile** option `aws --profile s3 ls ` `aws s3api list-objects --bucket ` * w/ **--max-items** option `aws s3api list-objects --bucket --max-items ` The total number of items to return. ### GET #### Get Bucket Location `aws s3api get-bucket-location --bucket ` * w/ **--profile** option `aws --profile s3api get-bucket-location --bucket ` ### REMOVE #### Remove Multiple Files Remove multiple files by `--exclude` and `--include` arguments `aws s3 rm s3://bucketName/path --recursive --exclude "*" --include "pattern*"` ``` # example code aws s3 rm s3://cloudfront-test/logs --recursive --exclude "*" --include "abcdefghijklmnopqr/2020/10/04/*" ``` * **Results** ![remove-multiple-objects](https://user-images.githubusercontent.com/48475824/95204315-e8a37680-081e-11eb-8361-873b5a800fa7.png) [↑ return to TOC](#table-of-contents) </detail>