ADToolbox Commandline Interface
Initialization
First, we need to initialize the CLI: After installing ADToolbox, type and execute the following in your terminal to initialize the base directory for ADToolbox files:
ADToolbox
You should see the following if you are running this for the first time:
No Base Directory Found:
Where do you want to store your ADToolbox Data?:
Type the absolute path directory of interest. Don't worry if you mess this part up. You can change this later as well. you can type '.' for now and change this later.
You can access all the commands along with their brief explanation by:
ADToolbox --help
ADToolbox Modules
This toolbox is comprised of different modules:
-
Configs Module
-
Database Module
-
Metagenomics Module
-
ADM Module
-
Documentations Module
1. Configs Module
After installation, the base working directory must be specified:
ADToolbox Configs --help
────────────────────────────────── ADToolBox───────────────────────────────────
usage: ADToolBox Configs [-h] [-s SET_BASE_DIR] [-g] [--get-base-dir]
optional arguments:
-h, --help show this help message and exit
-s SET_BASE_DIR, --set-base-dir SET_BASE_DIR
Set the base directory for ADToolBox to work with
-g, --get-base-dir Get the current base directory for ADToolBox
This will give you a list of all functionalities that are related to the configurations of the toolbox. Here we go one by one in the correct order:
- set-base-dir: The first configuration command will allow you to set the base directory for ADToolbox to work. This could be an existing folder somewhere in your files or a directory that you are willing to create. If the directory does not already exit, it will be automatically created after this command. For example if I want to set the base directory to be ADToolbox directory on my desktop the command would be, in MacOS, something like this:
ADToolbox Configs --set-base-dir ~/Desktop/ADToolbox
Anything that you will do from now on, will be saved in this directory.
2. Database Module
Any database that is used by ADToolbox can be modified from this module. Type the following in your commandline to find all of the database module's commands:
ADToolbox Database --help
──────────────────────────── ADToolBox ────────────────────────────
usage: ADToolBox Database [-h]
{initialize-feed-db,add-feed,sh
ow-feed-db,initialize-metagenomics-studies-db,add-metagen
omics-study,initialize-protein-db,add-protein,download-re
action-db,download-seed-reaction-db,build-protein-db,down
load-protein-db,download-amplicon-to-genome-dbs,download-
all-databases}
...
positional arguments:
{initialize-feed-db,add-feed,show-feed-db,initialize-me
tagenomics-studies-db,add-metagenomics-study,initialize-p
rotein-db,add-protein,download-reaction-db,download-seed-
reaction-db,build-protein-db,download-protein-db,download
-amplicon-to-genome-dbs,download-all-databases}
Database commands:
initialize-feed-db Initialize the Feed DB
add-feed Add a feed to the feed database
show-feed-db Shows the feed database
initialize-metagenomics-studies-db
Initialize the Metagenomics Studies DB
add-metagenomics-study
Add a metagenomics study to the Kbase
initialize-protein-db
Generates the protein database for ADToolbox
add-protein Add a protein to the protein database
download-reaction-db
Downloads the reaction database in CSV
format
download-seed-reaction-db
Downloads the seed reaction database in
JSON format
build-protein-db Generates the protein database for
ADToolbox
download-protein-db
Downloads the protein database in fasta
format; You can alternatively build it
from reaction database.
download-amplicon-to-genome-dbs
downloads amplicon to genome databases
download-all-databases
downloads all databases that are required by ADToolbox at once
options:
-h, --help show this help message and exit
We will now go over these commands one by one:
- initialize-feed-db: This will create an empty JSON file in the Database sub-directory in your base directory that will hold all the future feed information that you add. You can run this command by:
ADToolbox Database initialize-feed-db
- add-feed: This will add feed data to the database. Such data includes: the name of the feed (-n, --name), carbohydrate content of the feed in a percentage (-c, --carbohydrates), protein content of the feed in a percecntage (-p, --proteins), lipid content of the feed in a percentage (-l, --lipids), total suspended solid content of the feed in a percentage (-t, --tss), soluable inert content of feed in a percentage (-s, --si), particulate inert content of feed in a percentage (-x, --xi), and a reference where numbers came from (-r, --reference). This command is run by:
ADToolbox Database add-feed
An example of this would look like:
ADToolbox Database add-feed -n "test feed" -c 20 -p 20 -l 20 -t 20 -s 20 -x 20 -r "test reference"
- show-feed-db: As the name implies, this will show the user the feed database along with any values they have added to it, in the command window. This command is run by:
ADToolbox Database show-feed-db
- initialize-metagenomics-studies-db: This will create an empty TSV file in the Database sub-directory in your base directory that will hold all the future information about various metagenomics studies that you add. You can run this command by:
ADToolbox Database initialize-metagenomics-studies-db
- add-metagenomics-study: This command will add a metagenomics study to the Kbase and will require the study name (-n,--name), study type (-t, --type), microbiome where the metagenomics study belongs to (-m, --microbiome), SRA accession ID for the sample (-s, --sample_accesion), SRA accession ID for the project (-p, --study_accesion), and comments on the study of interest (-c, --comments). This command is run by:
ADToolbox Database add-metagenomics-study
An example of this would look like:
ADToolbox Database add-metagenomics-study -n test_study -t 16s -m "anaerobic digestion" -s 11111111 -c "this is just a test" -p 222222
- initialize-protein-db: This will create an empty JSON file in the Database sub-directory in your base directory that will hold all the future protein information that you add. You can run this command by:
ADToolbox Database initialize-protein-db
- add-protein: As the name implies, this will add information about a protein to the empty protein database. Information about such protein includes its UniProt ID (-i, --uniprot-id), and the name attached to the protein which is usually the EC number (-n, --name). You can run this command by:
ADToolbox Database add-protein
An example of this would look like:
ADToolbox Database add-protein -i ATEST1 -n 1.1.1.1
NOTE: Skip the following download commands if you have run ADToolbox Configs download-all-databases
- download-reaction-db: As the name implies, this will download the ADToolbox reaction database. This is required for many important modules of the toolbox
ADToolbox Database download-reaction-db
- download-protein-db: Downloads the protein database in fasta format; You can alternatively build it from reaction database if you have downloaded it; Check below.
ADToolbox Database download-protein-db
- build-protein-db: Generates the protein database for ADToolbox from the reaction database:
ADToolbox Database build-protein-db
- download-amplicon-to-genome-dbs: If you need to use the 16s mapping to the protein database and ADM, you will need to download the required databases using this command:
ADToolbox Database download-amplicon-to-genome-dbs
- download-seed-reaction-db: This will download the SEED reaction database in JSON format.
ADToolbox Database download-seed-reaction-db
3. Metagenomics Module
Metagenomics module of ADToolbox is designed to input metagenomics data into consideration when designing an AD process.
You can observe all the functionalities by:
ADToolbox Metagenomics --help
──────────────────────────── ADToolBox ────────────────────────────
usage: ADToolBox Metagenomics [-h]
{download_from_sra}
...
positional arguments:
{download_from_sra,download_genome}
download_from_sra This module provides a command line interface to download
metagenomics data from SRA
download_genome This module provides a command line interface to download
genomes from NCBI
align-genome Align genomes to the protein database
of ADToolbox, or any other fasta with
protein sequences
align-multiple-genomes
Align multiple Genomes to the protein
database of ADToolbox, or any other
fasta with protein sequences
find-representative-genomes
Finds representative genomes from the
repseqs fasta file
options:
-h, --help show this help message and exit
- download_from_sra: This command takes a sample accesion ID (-s, --sample_accesion) for a sample, downloads it, and places it into a directory provided by the you (-o, --output-dir). It also requires you to state a container you are using. If you are downloading locally, put "None". Otherwise, you can use the containers docker or singularity. You can run this command by:
ADToolbox Metagenomics download_from_sra
An example of this command would look like:
ADToolbox Metagenomics download_from_sra -s SRR28403133 -o OUTPUT/DIRECTORY/PATHNAME -c None
- download_genome: This command requires you to provide a NCBI accesion ID for a genome (-g, --genome_accesion), and output directory (-o,--output-dir), and a container (-c, --container). It will then take the NCBI accesion ID for a genome and download it in the directory provided by you. If you are downloading locally, put "None" as your container option. Otherwise, you can use the containers docker or singularity. You can run this command by:
ADToolbox Metagenomics download_genome
An example of this command would look like:
ADToolbox Metagenomics download_genome -g GCA021152825.1 -o OUTPUT/DIRECTORY/PATHNAME -c None
- align-genome: This command requires that you to give a name for the genome (-n,--name),the address of the JSON file that includes information about the genome to be aligned (-i,--input-file), and output directory to store alignment results (-o,--output-dir), a container to use for the alignment (-c,--container), and the directory containing the protein database to be used for the alignment (-d, --protein-db-dir). If you are downloading locally, put "None" as your container option. Otherwise, you can use the containers docker or singularity. Overall, this command takes a genome and aligns it to a protein sequence. You can run this command by:
ADToolBox Metagenomics align-genome
An example of this code would look like:
ADToolbox Metagenomics align-genome -n "test genome" -i INPUT/PATHNAME/OF/GENOME -o OUTPUT/PATHNAME/DIRECTORY -c None -d PATHNAME/OF/PROTEIN
- align-multiple-genomes: This command allows you to align multiple genomes to the protein database of ADToolbox, or any other fasta file with protein sequences. It requires to user to input the address to a JSON file that holds the information about the genomes (-i,--input-file), an output directory to store the alignment results (-o,--output-dir), a container to use for the alignment (-c,--container), and the directory containing the protein database to be used for alignment (-d,--protein-db-dir). If you are downloading locally, put "None" as your container option. Otherwise, you can use the containers docker or singularity. This command can be run by:
ADToolbox Metagenomics align-multiple-genomes
An example of this command looks like:
ADToolbox Metagenomics align-multiple-genomes -i PATHNAME/TO/FILE/OF/GENOMES -o OUTPUT/DIRECTORY -c None -d DIRECTORY/OF/PROTEIN/DATABSE
- find-represenative-genomes: This command maps represenative amplicon sequences to a representative genome in GTDB database. It requires the user to provide the address to the repseqs fasta file (-i,--input-file), the directory of the output file (-o, --output-dir), a container used for the alignment (-c,--container), and the format of the output file which can be json or csv (-f,--format). Something optional that you can provide is the similarity cutoff for clustering; though, the default is 0.97 (-s,--similarity). If you are downloading locally, put "None" as your container option. Otherwise, you can use the containers docker or singularity. You can run this code by:
ADToolbox Metagenomics find-representative-genomes
An example of this code will look like:
ADToolbox Metagenomics find-representative-genomes -i PATHNAME/TO/REPSEQS/FASTA/FILE -o PATHNAME/TO/OUTPUT/DIRECTORY -c None -f csv
4. ADM Module
ADM module provides all the tools needed to run instances of ADM Model. This includes the original ADM, Batstone et al., and the Modified-ADM suggested by the Authors of ADToolbox. In order to find out about all the functionalities in this module, you can run:
ADToolbox ADM --help
────────────────────────────────── ADToolBox ───────────────────────────────────
usage: ADToolBox ADM [-h] {original-adm1,modified-adm,show-escher-map} ...
positional arguments:
{original-adm1,modified-adm,show-escher-map}
Available ADM Commands:
original-adm1 Original ADM1:
modified-adm Modified ADM:
options:
-h, --help show this help message and exit
- original-adm1: If you want to run the original ADM, batstone et al, in your browser you can run this command with the required parameters in JSON format:
ADToolbox ADM original-adm1 --help
────────────────────────────────── ADToolBox ───────────────────────────────────
usage: ADToolBox ADM original-adm1 [-h] [--model-parameters MODEL_PARAMETERS]
[--base-parameters BASE_PARAMETERS]
[--initial-conditions INITIAL_CONDITIONS]
[--inlet-conditions INLET_CONDITIONS]
[--reactions REACTIONS] [--species SPECIES]
[--metagenome-report METAGENOME_REPORT]
[--report REPORT]
options:
-h, --help show this help message and exit
--model-parameters MODEL_PARAMETERS
Model parameters for ADM 1
--base-parameters BASE_PARAMETERS
Provide json file with base parameters for original
ADM1
--initial-conditions INITIAL_CONDITIONS
Provide json file with initial conditions for original
ADM1
--inlet-conditions INLET_CONDITIONS
Provide json file with inlet conditions for original
ADM1
--reactions REACTIONS
Provide json file with reactions for original ADM1
--species SPECIES Provide json file with species for original ADM1
--metagenome-report METAGENOME_REPORT
Provide json file with metagenome report for original
ADM1
--report REPORT Describe how to report the results of original ADM1.
Current options are: 'dash' and 'csv'
Every argument is optional, and their role is clear from the comments in front of them, So we just provide a full example of this command:
ADToolbox ADM original-adm1 \
--model-parameters ~/Desktop/Model_Parameters.json \
--base-parameters ~/Desktop/Base_Parameters.json \
--initial-conditions ~/Desktop/Initial_Conditions.json \
--inlet-conditions ~/Desktop/Inlet-Conditions.json \
--reactions ~/Desktop/Reactions.json
--species ~/Desktop/Species.json
--metagenome-report ~/Desktop/ADM_Mapping_Report.json
--repor dash
NOTE if you choose dash for your report, the CLI will prompt you to open your browser in the instructed address, if you choose csv, it will generate a CSV file that includes concentration profiles simulated over time.
- modified-adm: This command is exactly similar to the previous one, except that it requires parameters taylored for modified ADM:
ADToolbox ADM modified-adm --help
────────────────────────────────── ADToolBox ───────────────────────────────────
usage: ADToolBox ADM modified-adm [-h] [--model-parameters MODEL_PARAMETERS]
[--base-parameters BASE_PARAMETERS]
[--initial-conditions INITIAL_CONDITIONS]
[--inlet-conditions INLET_CONDITIONS]
[--reactions REACTIONS] [--species SPECIES]
[--metagenome-report METAGENOME_REPORT]
[--report REPORT]
options:
-h, --help show this help message and exit
--model-parameters MODEL_PARAMETERS
Model parameters for Modified ADM
--base-parameters BASE_PARAMETERS
Provide json file with base parameters for modified
ADM
--initial-conditions INITIAL_CONDITIONS
Provide json file with initial conditions for modified
ADM
--inlet-conditions INLET_CONDITIONS
Provide json file with inlet conditions for modified
ADM
--reactions REACTIONS
Provide json file with reactions for modified ADM
--species SPECIES Provide json file with species for modified ADM
--metagenome-report METAGENOME_REPORT
Provide json file with metagenome report for modified
ADM
--report REPORT Describe how to report the results of modified ADM.
Current options are: 'dash' and 'csv'
The usage is exactly the same as the original-adm
- show-escher-map: This command will prompt you to open an escher map for the modified-adm model in your browser with the instructed address:
ADToolbox ADM show-escher-map
5. Documentations Module
You can view the documentaion in your CLI using rich's markdown render. You can do this by:
ADToolbox Documentations --show