=====================================
Writing your first Gluepy app, part 4
=====================================

In the previous parts of this tutorial (:doc:`/intro/tutorial01`, :doc:`/intro/tutorial02`
and :doc:`/intro/tutorial03`), we set up a project that contains a :ref:`dags` that trains
a machine learning model on some sample data.

Next up, we'll talk about the CLI and commands. Gluepy comes bundled with commands for basic
tasks such as running your :ref:`dags` with the :ref:`cli_dag`, but there may be situations
where you want to add functionality or scripts to your project that do not fit into the concept
of a :ref:`dags` or :ref:`tasks`.

For example, you may want to write a command that copies a run folder, or a command that takes
a trained .pkl model file and deploys it in a registry.

In this final step of the tutorial, we will introduce the concept of writing custom :ref:`cli`
by creating a command that copies the output of a previous run to a new location, to simulate
a deployment to production.

Reviewing the default CLI command
=================================

If you recall :doc:`/intro/tutorial01`, when we created our ``forecaster`` module using the
``startmodule`` command, it generated a file at ``forecaster/commands.py`` that looks like this:

.. code-block:: python

    import click
    from gluepy.commands import cli


    @cli.command()
    def sample():
        click.echo("Sample command called")

What happens here is the following:

* The command uses Click under the hood for CLI-related logic such as adding options,
  groups of commands, help text and more.
* All commands in Gluepy served on ``manage.py`` are part of the ``gluepy.commands.cli`` group.
  You must add a command to ``gluepy.commands.cli`` using the ``@cli.command()`` decorator.

This command can be called using:

.. code-block:: bash

    $ python manage.py sample
    Sample command called

Creating a custom CLI command
=============================

Now let's modify this ``sample`` command to instead receive a path to a run folder, and copy the
.pkl model file that we created in :doc:`/intro/tutorial02` to a ``/data/production`` directory
to simulate a deployment.

In a real project, you may instead deploy the model to something like MLflow.

.. code-block:: python

    import os

    import click
    from gluepy.commands import cli
    from gluepy.files.storages import default_storage


    @cli.command()
    @click.argument("run_folder")
    def deploy(run_folder):
        # Copy the trained model from the given run folder to the "production" folder.
        default_storage.cp(
            os.path.join(run_folder, "model.pkl"),
            os.path.join("production", "model.pkl"),
        )
        click.echo("Model deployed to production")

The code above does the following:

* Adds a new command named ``deploy`` to the ``manage.py`` CLI using the ``@cli.command()`` decorator.
* Adds a new Click argument that expects the user to pass a :ref:`context_run_folder` path.
* Uses ``default_storage`` to copy the file from our run folder to a centralized folder that we use
  for "production" models.

This can now be called in the following manner:

.. code-block:: bash

    $ python manage.py deploy runs/2024/6/25/c29b8b49-dee9-4984-8ccc-860651780054/
    Model deployed to production

Wrapping up
===========

That was it for this tutorial. We have now learned:

* How to create new projects.
* How to create a :ref:`dags` consisting of two :ref:`tasks` that train a machine learning model.
* Using output versioning with :ref:`context_run_folder`.
* Retrying DAG runs and running a subset of runs.
* Parameterizing our model using YAML and :ref:`topic_context`.
* File system interactions with ``default_storage`` and :ref:`topic_storage`.

You should now be familiar with the key concepts of Gluepy. To read more details, see:

* Topic guides
* Reference guides
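
As an optional extra for the ``deploy`` command written in this part: because Gluepy commands
are Click commands, one way to check them is Click's ``CliRunner``. The sketch below is
self-contained so that it runs outside a Gluepy project, which means several names are
assumptions made for illustration only: the local ``cli`` group stands in for
``gluepy.commands.cli``, ``shutil.copy`` stands in for ``default_storage.cp``, and the
``--dest`` option and the ``run_deploy_demo`` helper are hypothetical and not part of Gluepy.

.. code-block:: python

    import os
    import shutil
    import tempfile

    import click
    from click.testing import CliRunner


    # Assumption: a local stand-in for ``gluepy.commands.cli``.
    @click.group()
    def cli():
        pass


    @cli.command()
    @click.argument("run_folder")
    @click.option("--dest", default="production", help="Hypothetical target directory.")
    def deploy(run_folder, dest):
        # Assumption: a plain local copy stands in for ``default_storage.cp``.
        os.makedirs(dest, exist_ok=True)
        shutil.copy(
            os.path.join(run_folder, "model.pkl"),
            os.path.join(dest, "model.pkl"),
        )
        click.echo("Model deployed to production")


    def run_deploy_demo():
        """Invoke ``deploy`` against a temporary run folder and report what happened."""
        with tempfile.TemporaryDirectory() as tmp:
            run_folder = os.path.join(tmp, "runs", "demo")
            dest = os.path.join(tmp, "production")
            os.makedirs(run_folder)
            # Write a placeholder file in place of a real trained model.
            with open(os.path.join(run_folder, "model.pkl"), "wb") as f:
                f.write(b"placeholder model bytes")
            result = CliRunner().invoke(cli, ["deploy", run_folder, "--dest", dest])
            copied = os.path.exists(os.path.join(dest, "model.pkl"))
            return result.exit_code, result.output, copied

Calling ``run_deploy_demo()`` returns the exit code, the captured output and whether the file
was copied. In a real project you would probably import your own ``deploy`` command and the
Gluepy ``cli`` group instead of the stand-ins used here.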