The following links will help you install Amazon Genomics CLI and quickly run a demo workflow.
Getting Started
- 1: Prerequisites
- 2: Installation
- 3: Setup
- 4: Hello world
1 - Prerequisites
To run Amazon Genomics CLI the following prerequisites must be met:
- A computer with one of the following operating systems:
- macOS 10.14+
- Amazon Linux 2
- Ubuntu 20.04
- Windows 10 with Windows Subsystem for Linux (WSL) running Ubuntu, in which the commands are run
- Internet access
- An AWS Account
- An AWS role with sufficient access. To generate the minimum required policies for admins and users, please follow the instructions here
Running Amazon Genomics CLI on Windows has not been tested, but it should run in WSL 2 with Ubuntu 20.04
Prerequisite installation
Ubuntu 20.04
- Install node.js
curl -fsSL https://deb.nodesource.com/setup_15.x | sudo -E bash -
sudo apt-get install -y nodejs
- Install and configure AWS CLI
sudo apt install awscli
aws configure
# ... set access key ID, secret access key, and region
Amazon Linux 2 (e.g. on an EC2 instance)
- Install node
curl -sL https://rpm.nodesource.com/setup_16.x | sudo -E bash -
sudo yum install -y nodejs
- If you have not already done so, configure your AWS credentials and default region
aws configure
macOS
- Install Homebrew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Install node
brew install node
- Install and configure AWS CLI
brew install awscli
aws configure
# ... set access key ID, secret access key, and region
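After following the steps for your operating system, it can be useful to confirm that the tools are actually on your PATH. The following is a minimal sketch, not part of Amazon Genomics CLI: the function names `missing_tools` and `report_prerequisites` are hypothetical helpers introduced here for illustration, and the check only covers `node` and `aws`, the two tools installed above.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Report on the tools installed in the steps above (hypothetical helper).
report_prerequisites() {
  missing=$(missing_tools node aws)
  if [ -z "$missing" ]; then
    echo "node and aws are both available"
  else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing prerequisites:" $missing
    return 1
  fi
}
```

Running `report_prerequisites` in the shell where you plan to use Amazon Genomics CLI should tell you whether either tool still needs to be installed.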
2 - Installation
Download and install Amazon Genomics CLI
Download the Amazon Genomics CLI zip, unzip its contents, and run the install.sh script.
To download a specific release, see the releases page of our GitHub repo.
To download the latest release, navigate to https://github.com/aws/amazon-genomics-cli/releases/
The latest nightly build can be accessed here: s3://healthai-public-assets-us-east-1/amazon-genomics-cli/nightly-build/amazon-genomics-cli.zip
You can download the nightly by running the following:
aws s3api get-object --bucket healthai-public-assets-us-east-1 --key amazon-genomics-cli/nightly-build/amazon-genomics-cli.zip amazon-genomics-cli.zip
Once you have downloaded a release or the nightly build, type the following to install, using the name of the zip file you downloaded:
unzip amazon-genomics-cli-<version>.zip
cd amazon-genomics-cli/
./install.sh
This will place the agc command in $HOME/bin.
The Amazon Genomics CLI is a statically compiled Go binary. It should run in your environment natively without any additional setup. Test the CLI with:
$ agc --help
🧬 Launch and manage genomics workloads on AWS.
Commands
Getting Started 🌱
account Commands for AWS account setup.
Install or remove AGC from your account.
Contexts
context Commands for contexts.
Contexts specify workflow engines and computational fleets to use when running a workflow.
Logs
logs Commands for various logs.
Projects
project Commands to interact with projects.
Workflows
workflow Commands for workflows.
Workflows are potentially-dynamic graphs of computational tasks to execute.
Settings ⚙️
configure Commands for configuration.
Configuration is stored per user.
Flags
--format string Format option for output. Valid options are: text, table, json (default "text")
-h, --help help for agc
--silent Suppresses all diagnostic information.
-v, --verbose Display verbose diagnostic information.
--version version for agc
Examples
Displays the help menu for the specified sub-command.
`$ agc account --help`
If this doesn’t work immediately, try:
- starting a new terminal shell
- modifying your $HOME/.bashrc (or equivalent file) by appending the following line, then restarting your shell:
export PATH=$HOME/bin:$PATH
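If your shell startup file is sourced more than once, the plain export above adds a duplicate entry each time. The sketch below is one way to make the change idempotent; `path_contains` is a hypothetical helper name introduced here, not something shipped with Amazon Genomics CLI.

```shell
# Succeeds if directory $1 is already one of the colon-separated entries of $2.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Prepend $HOME/bin only when it is missing, so re-sourcing the file is harmless.
if ! path_contains "$HOME/bin" "$PATH"; then
  export PATH="$HOME/bin:$PATH"
fi
```

Wrapping both sides in colons before matching avoids false positives on partial matches, such as treating `/a` as present because `/a/bin` is.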
If you are running this on macOS, you may see a popup window when you first run any agc command, due to Apple’s security restrictions.
Click Cancel and navigate to Apple’s System Preferences, click Security & Privacy, then click General. Near the bottom, you will see a line indicating "agc" was blocked from use because it is not from an identified developer.
To the right, click Allow Anyway.
Now go back to the terminal and run agc --help again. A new popup window will ask you to override the system security.
Click Open; agc is now correctly installed.
Verify that you have the latest version of Amazon Genomics CLI with:
agc --version
If you do not, you may need to uninstall any previous versions of Amazon Genomics CLI and reinstall the latest.
Command Completion
Amazon Genomics CLI can generate shell completion scripts that enable ‘Tab’ completion of commands. Command completion is optional and not required to use Amazon Genomics CLI. To generate a completion script you can use:
agc completion <shell>
where “shell” is one of bash, zsh, fish, or powershell. To load completions in your current session:
Bash:
source <(agc completion bash)
To load completions for each session, execute once:
Linux:
agc completion bash > /etc/bash_completion.d/agc
macOS:
If you haven’t already installed bash-completion, execute the following once:
brew install bash-completion
and then add the following line to your ~/.bash_profile:
[[ -r "/usr/local/etc/profile.d/bash_completion.sh" ]] && . "/usr/local/etc/profile.d/bash_completion.sh"
Once bash completion is installed, execute:
agc completion bash > /usr/local/etc/bash_completion.d/agc
Zsh:
If shell completion is not already enabled in your environment, you will need to enable it. You can execute the following once:
echo "autoload -U compinit; compinit" >> ~/.zshrc
To load completions for each session, execute once:
agc completion zsh > "${fpath[1]}/_agc"
You will need to start a new shell for this setup to take effect.
fish:
agc completion fish | source
To load completions for each session, execute once:
agc completion fish > ~/.config/fish/completions/agc.fish
PowerShell:
agc completion powershell | Out-String | Invoke-Expression
To load completions for every new session, run:
agc completion powershell > agc.ps1
and source this file from your PowerShell profile.
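If you set up several machines, you may want to pick the completion argument from the login shell rather than typing it. The following is a sketch under stated assumptions: `completion_shell_name` is a hypothetical helper, it only recognizes the four shells listed in this section, and it assumes a PowerShell binary named `pwsh` or `powershell`.

```shell
# Map a shell path such as /bin/zsh to a name accepted by `agc completion`.
# Prints nothing and returns 1 for shells not listed in this section.
completion_shell_name() {
  name=${1##*/}          # strip the directory part, e.g. /bin/zsh -> zsh
  case "$name" in
    bash|zsh|fish) printf '%s\n' "$name" ;;
    pwsh|powershell) printf 'powershell\n' ;;
    *) return 1 ;;
  esac
}
```

For example, `agc completion "$(completion_shell_name "$SHELL")"` would generate the script for your login shell, provided it is one of the supported ones.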
3 - Setup
Account activation
To start using Amazon Genomics CLI with your AWS account, you need to activate it.
agc account activate
This will create the core infrastructure that Amazon Genomics CLI needs to operate, which includes a DynamoDB table, an S3 bucket and a VPC. This will take ~5min to complete. You only need to do this once per account region.
The DynamoDB table is used by the CLI for persistent state. The S3 bucket is used for durable workflow data and Amazon Genomics CLI metadata, and the VPC is used to isolate compute resources. You can specify your own preexisting S3 bucket or VPC if needed using the --bucket and --vpc options.
CDK Bootstrap
Attention
This step is NOT required when using Amazon Genomics CLI version 1.2 or above.
Amazon Genomics CLI uses AWS CDK to deploy infrastructure. Activating an account will bootstrap the AWS Environment for CDK app deployments. CDK Bootstrap deploys the resources that CDK itself needs in order to deploy CDK-defined infrastructure. Full details are available here.
Define a username
Amazon Genomics CLI requires that you define a username and email. You can do this using the following command:
agc configure email you@youremail.com
The username only needs to be configured once per computer that you use Amazon Genomics CLI from.
4 - Hello world
When you install Amazon Genomics CLI it will create a folder named agc. Inside there is an examples/demo-project folder containing an agc-project.yaml with some demo projects including a simple “hello world” workflow.
Running Hello World
- Ensure you have met the prerequisites and installed Amazon Genomics CLI
- Ensure you have followed the activation steps
- cd ~/amazon-genomics-cli/examples/demo-wdl-project
- Run agc context deploy --context myContext. This step takes approximately 5 minutes to deploy the infrastructure.
- Run agc workflow run hello --context myContext and take note of the returned workflow instance ID.
- Check on the status of the workflow with agc workflow status -r <workflow-instance-id>. Initially you will see a status like SUBMITTED, but after the elastic compute resources have been spun up and the workflow runs you should see something like the following:
WORKFLOWINSTANCE myContext 9ff7600a-6d6e-4bda-9ab6-c615f5d90734 COMPLETE 2021-09-01T20:17:49Z
Congratulations! You have just run your first workflow in the cloud using Amazon Genomics CLI! At this point you can run additional workflows, including submitting several instances of the “hello world” workflow. The elastic compute resources will expand and contract as necessary to accommodate the backlog of submitted workflows.
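Rather than re-running the status command by hand, you could poll it from a script. The following is a minimal sketch, not an official tool: `instance_state` and `wait_for_workflow` are hypothetical helper names, the parsing assumes the whitespace-separated WORKFLOWINSTANCE line format shown in the steps above, and COMPLETE is the only finished state this page documents, so any other state simply keeps the loop waiting until it gives up.

```shell
# Extract the state column from a status line of the form shown above:
# WORKFLOWINSTANCE <context> <instance-id> <state> <timestamp>
instance_state() {
  printf '%s\n' "$1" | awk '$1 == "WORKFLOWINSTANCE" { print $4; exit }'
}

# Poll until the instance reports COMPLETE, or give up after max-attempts.
# Usage: wait_for_workflow <instance-id> [interval-seconds] [max-attempts]
wait_for_workflow() {
  id=$1
  interval=${2:-30}
  max_attempts=${3:-40}
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    state=$(instance_state "$(agc workflow status -r "$id")")
    echo "attempt $((attempt + 1)): ${state:-unknown}"
    [ "$state" = "COMPLETE" ] && return 0
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1   # never saw COMPLETE within the allowed attempts
}
```

A call such as `wait_for_workflow <workflow-instance-id> 30 40` checks every 30 seconds for up to 20 minutes. The attempt limit is there so that a workflow that never reaches COMPLETE does not leave the script running indefinitely.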
Reviewing the Results
Workflow results are written to an S3 bucket specified or created by Amazon Genomics CLI during account activation. You can list or retrieve the S3 URI for the bucket with:
AGC_BUCKET=$(aws ssm get-parameter \
--name /agc/_common/bucket \
--query 'Parameter.Value' \
--output text)
and then use aws s3 commands to explore and retrieve data from the bucket. Workflow output will be in the path:
s3://agc-<account-num>-<region>/project/<project-name>/userid/<user-id>/context/<context-name>/workflow/<workflow-name>/
The rest of the path depends on the engine used to run the workflow. For Cromwell it will continue with:
.../cromwell-execution/<wdl-wf-name>/<workflow-run-id>/<task-name>
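Because the output path is assembled from several identifiers, a small helper can reduce typing mistakes. This is an illustrative sketch only: `workflow_output_prefix` is a hypothetical function, and it assumes its first argument is the bucket URI in the form s3://agc-<account-num>-<region> described above.

```shell
# Build the workflow output prefix from its parts, following the layout above.
# Usage: workflow_output_prefix <bucket-uri> <project> <user-id> <context> <workflow>
workflow_output_prefix() {
  bucket=${1%/}   # tolerate a trailing slash on the bucket URI
  printf '%s/project/%s/userid/%s/context/%s/workflow/%s/\n' \
    "$bucket" "$2" "$3" "$4" "$5"
}
```

The result can be passed to the AWS CLI, for example `aws s3 ls "$(workflow_output_prefix <bucket-uri> <project-name> <user-id> myContext hello)"`, to list what the engine wrote under that prefix.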
If a workflow declares outputs then you may obtain these using the command:
agc workflow output <workflow_run_id>
You should see a response similar to:
OUTPUT id 6cc6f742-dc87-4649-b319-1af45c4c09c6
OUTPUT outputs.hello_agc.hello.out Hello Amazon Genomics CLI!
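If you want to use a single output value in a script, you could filter the response by key. The sketch below assumes the text format shown above, with whitespace-separated fields where the value is everything after the key; `output_value` is a hypothetical helper, not an Amazon Genomics CLI command.

```shell
# Read `agc workflow output` text on stdin and print the value recorded for
# the key given as $1 (everything after the key on that line).
output_value() {
  awk -v key="$1" '$1 == "OUTPUT" && $2 == key {
    sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]*/, "")  # drop the OUTPUT and key fields
    print
    exit
  }'
}
```

For example, `agc workflow output <workflow_run_id> | output_value outputs.hello_agc.hello.out` would print only the greeting text from the response above.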
You can also obtain task logs for a workflow using a command of the following form: agc logs workflow <workflow-name> -r <instance-id>
Note, if the workflow did not actually run any tasks due to call caching then there will be no output from this command.
Cleaning Up
Once you are done with myContext you can destroy it with:
agc context destroy myContext
This will remove the cloud resources associated with the named context, but will keep any S3 outputs and CloudWatch logs.
If you want to stop using Amazon Genomics CLI in your AWS account entirely, you need to deactivate it:
agc account deactivate
This will remove Amazon Genomics CLI’s core infrastructure. If Amazon Genomics CLI created a VPC as part of the activate process, it will be removed. If Amazon Genomics CLI created an S3 bucket for you, it will be retained.
To uninstall Amazon Genomics CLI from your local machine, run the following command:
./agc/uninstall.sh
Note that uninstalling the CLI will not remove any resources or persistent data from your AWS account.
Next Steps
- Familiarize yourself with Amazon Genomics CLI Concepts
- Try some tutorials