RunnableAnalysis class

An easier way to replay your analysis on new data

A RunnableAnalysis object is returned when you pass return_type='analysis' to the spreadsheet function:

analysis = spreadsheet(return_type='analysis')

Why Use the RunnableAnalysis class?

Mito is built for automation. When you make edits in the Mitosheet, it generates code that can replay those edits on new datasets. To make that automation easier in your dashboard app, you can use the RunnableAnalysis class.

First, to help you rerun the analysis with new data, the RunnableAnalysis class gives you access to its parameters: the things you can change when re-running the analysis. Currently, parameters are either import or export locations.

Furthermore, when you're ready to re-run your analysis, the RunnableAnalysis.run() function allows you to override those parameters with new values. For example, you can apply the same set of edits to two different CSV files.

To see a fully executable example, scroll to the bottom.

API

get_param_metadata(param_type: Literal['import', 'export'])

Use get_param_metadata to access all of the parameters that you can override in your analysis. You can also pass 'import' or 'export' to get only the parameters of that type.

This is useful for displaying inputs on a dashboard that collect new values for rerunning the analysis.

The return type of this function is a list of ParamMetadata objects. They'll look like this:

class ParamMetadata(TypedDict):
    type: ParamType
    subtype: ParamSubtype
    required: bool
    name: str
    original_value: Optional[str]

required

Some parameters are marked as required, which means you must pass a value for them to the run function. These are the parameters that were passed as positional dataframe arguments to the spreadsheet function, so their values aren't stored in the ParamMetadata.

name

This is the name of the variable for this parameter in the code. It can be used for display, but its main use is to pass that parameter to the run function as a keyword argument.

original_value

This is the value that was originally used for this parameter when creating this analysis. The run function will default to using this if you don't pass this parameter to the function.

Type/Subtype

The ParamType and ParamSubtype types describe how the parameter is used. The 'type' of a parameter is either 'import' or 'export', and the 'subtype' describes whether the file is a CSV or Excel file, or whether the data was passed in another way (for example, as a dataframe). The types are defined as:

ParamType = Literal[
    'import',
    'export'
]

ParamSubtype = Literal[
    'import_dataframe',
    'file_name_export_excel',
    'file_name_export_csv',
    'file_name_import_excel',
    'file_name_import_csv',
    'all' # This represents all of the above
]
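To illustrate how these fields fit together, here is a minimal sketch that redeclares the types locally and filters a hand-written list of metadata. The sample entries (the names df1 and file_name_export_csv_0, and both file paths) are hypothetical and are not output from a real analysis; in practice the list would come from get_param_metadata.

```python
from typing import List, Literal, Optional, TypedDict

ParamType = Literal['import', 'export']
ParamSubtype = Literal[
    'import_dataframe',
    'file_name_export_excel',
    'file_name_export_csv',
    'file_name_import_excel',
    'file_name_import_csv',
    'all',
]


class ParamMetadata(TypedDict):
    type: ParamType
    subtype: ParamSubtype
    required: bool
    name: str
    original_value: Optional[str]


# Hypothetical sample data, hand-written for illustration only
sample_params: List[ParamMetadata] = [
    {'type': 'import', 'subtype': 'import_dataframe', 'required': True,
     'name': 'df1', 'original_value': None},
    {'type': 'import', 'subtype': 'file_name_import_csv', 'required': False,
     'name': 'file_name_import_csv_0', 'original_value': 'datasets/sales.csv'},
    {'type': 'export', 'subtype': 'file_name_export_csv', 'required': False,
     'name': 'file_name_export_csv_0', 'original_value': 'output.csv'},
]


def names_of(params: List[ParamMetadata], param_type: ParamType) -> List[str]:
    """Return the names of all parameters with the given type."""
    return [p['name'] for p in params if p['type'] == param_type]


import_names = names_of(sample_params, 'import')
required_names = [p['name'] for p in sample_params if p['required']]
```

Here the required dataframe parameter has no original_value, which matches the description of required parameters above.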

Example Usage

You could use it to display file uploaders for each import in the analysis:

import streamlit as st
from mitosheet.streamlit.v1 import spreadsheet

# Set the streamlit page to wide so you can see the whole spreadsheet
st.set_page_config(layout="wide")

# Create the spreadsheet with return type 'analysis'
analysis = spreadsheet(import_folder='datasets', return_type='analysis')

# Get all of the import parameters for that analysis
import_params = analysis.get_param_metadata('import')

# Use the parameter metadata to display the params
for param in import_params:
    st.file_uploader(param['name'])

run(*args, **kwargs)

Call this function to rerun your analysis with new data. It lets you override the original value of each parameter. Any parameter you don't pass keeps its original value, except for imports that were passed as positional arguments to the spreadsheet function: those are required, so you must pass a value for each of them.

The name value in the ParamMetadata should be used as the keyword for that param. So, for example:

import streamlit as st
from mitosheet.streamlit.v1 import spreadsheet

# Set the streamlit page to wide so you can see the whole spreadsheet
st.set_page_config(layout="wide")

# Create the spreadsheet with return type 'analysis'
analysis = spreadsheet(import_folder='datasets', return_type='analysis')

# Get all of the import parameters for that analysis
import_params = analysis.get_param_metadata('import')

print(import_params[0]['name'])
# Output: file_name_import_csv_0

analysis.run(file_name_import_csv_0='/path/to/new/data.csv')
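If you collect new values from several inputs, it can help to check them against the parameter metadata before calling run. The sketch below is one possible pre-check, not part of the Mito API: build_run_kwargs is a hypothetical helper, and the sample metadata (including the name df1 and the placeholder values) is hand-written for illustration.

```python
from typing import Any, Dict, List


def build_run_kwargs(param_metadata: List[dict], new_values: Dict[str, Any]) -> Dict[str, Any]:
    """Validate new values against the metadata and return kwargs keyed by param name."""
    known_names = {p['name'] for p in param_metadata}

    # Reject names that don't correspond to any parameter in the analysis
    unknown = sorted(set(new_values) - known_names)
    if unknown:
        raise KeyError(f"Unknown parameters: {unknown}")

    # Required parameters have no stored original value, so they must be supplied
    missing = [
        p['name'] for p in param_metadata
        if p['required'] and p['name'] not in new_values
    ]
    if missing:
        raise ValueError(f"Missing required parameters: {missing}")

    # Parameters left out here are expected to fall back to their original value in run
    return dict(new_values)


# Hypothetical metadata; in practice this comes from analysis.get_param_metadata()
sample_metadata = [
    {'name': 'df1', 'required': True, 'original_value': None},
    {'name': 'file_name_import_csv_0', 'required': False,
     'original_value': 'datasets/sales.csv'},
]

# The string stands in for a real dataframe
run_kwargs = build_run_kwargs(sample_metadata, {'df1': 'new dataframe goes here'})

# With a real analysis object you would then call:
# analysis.run(**run_kwargs)
```

Failing early with a clear message is a design choice for dashboards, where a missing upload is easier to report before the analysis starts.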

to_json and from_json

For easier storage of analyses, you can use to_json and from_json to store the analysis object. For example:

import streamlit as st
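# RunnableAnalysis is assumed to be importable from the same module as spreadsheet
from mitosheet.streamlit.v1 import spreadsheet, RunnableAnalysis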

# Set the streamlit page to wide so you can see the whole spreadsheet
st.set_page_config(layout="wide")

# Create the spreadsheet with return type 'analysis'
analysis = spreadsheet(return_type='analysis')

analysis_json = analysis.to_json()

# Store analysis_json somewhere. Note that it should be stored securely, as it
# may contain code that edits private data
#############################

# Then, load the stored analysis JSON. Replace the ... placeholder with your
# own loading code, for example reading it from a file or database
analysis_file_contents = ...

new_analysis_from_file = RunnableAnalysis.from_json(analysis_file_contents)
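As a rough sketch of the storage step, the snippet below writes a JSON string to a file and reads it back. It assumes to_json returns a string; the helper names, the analysis.json file name, and the stand-in JSON content are all hypothetical, since the real content is produced by Mito.

```python
import json
import tempfile
from pathlib import Path


def save_analysis_json(analysis_json: str, path: Path) -> None:
    """Write the serialized analysis to disk."""
    path.write_text(analysis_json, encoding='utf-8')


def load_analysis_json(path: Path) -> str:
    """Read the serialized analysis back from disk."""
    return path.read_text(encoding='utf-8')


# Stand-in for analysis.to_json(); not a real analysis
stand_in_json = json.dumps({'placeholder': 'not a real analysis'})

# A temporary directory keeps the sketch self-contained; a real app would use
# a location with appropriate access controls
with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / 'analysis.json'
    save_analysis_json(stand_in_json, json_path)
    loaded_json = load_analysis_json(json_path)

# With Mito installed, the loaded string would then be passed to:
# RunnableAnalysis.from_json(loaded_json)
```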

Example Usage

This is an example of using the RunnableAnalysis class from start to finish, including gathering new values for each parameter and creating a button to re-run the analysis on that new data.

If you want to run this code locally, make sure to have a folder called 'datasets' with the data you want to use (in the directory you're starting streamlit from).

If you use the Mitosheet to import data from the datasets directory you've created, these imports will appear as inputs in the dashboard. Changing them and clicking Run will rerun the analysis on the new data.

import streamlit as st
import pandas as pd 
from mitosheet.streamlit.v1 import spreadsheet

# Set the streamlit page to wide so you can see the whole spreadsheet
st.set_page_config(layout="wide")

# Create an empty spreadsheet
analysis = spreadsheet(
    import_folder='datasets',
    return_type='analysis'
)

# Create an object to store the new values for the parameters
updated_metadata = {}

# Loop through the parameters in the analysis to display an input for each one
for idx, param in enumerate(analysis.get_param_metadata()):
    new_param = None

    # For parameters that are exports, display a text input
    if param['subtype'] in ['file_name_export_excel', 'file_name_export_csv']:
        new_param = st.text_input(param['name'], value=param['original_value'], key=idx)

    # For parameters that are file imports, display a file uploader
    elif param['subtype'] in ['file_name_import_excel', 'file_name_import_csv']:
        new_param = st.file_uploader(param['name'], key=idx)
    
    if new_param is not None:
        updated_metadata[param['name']] = new_param

# Show a button to trigger re-running the analysis with the updated_metadata
run = st.button('Run')
if run:
    result = analysis.run(**updated_metadata)
    st.write(result)
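The branching on subtype in the example above can also be pulled into a small pure function, which makes it easy to test without running Streamlit. This is only a sketch: widget_kind and the two constant names are hypothetical, and the returned strings are labels for the Streamlit widgets used above rather than anything defined by Mito.

```python
from typing import Optional

FILE_EXPORT_SUBTYPES = {'file_name_export_excel', 'file_name_export_csv'}
FILE_IMPORT_SUBTYPES = {'file_name_import_excel', 'file_name_import_csv'}


def widget_kind(subtype: str) -> Optional[str]:
    """Map a parameter subtype to the kind of input the example displays for it."""
    if subtype in FILE_EXPORT_SUBTYPES:
        return 'text_input'
    if subtype in FILE_IMPORT_SUBTYPES:
        return 'file_uploader'
    # Other subtypes (such as 'import_dataframe') get no input in the example
    return None


csv_import_widget = widget_kind('file_name_import_csv')
excel_export_widget = widget_kind('file_name_export_excel')
dataframe_widget = widget_kind('import_dataframe')
```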
