I wanted a single keyboard shortcut that would let me switch between the built-in terminal and the editor in vscode. I couldn’t find any, so I made a short macro using the VSCode Macros Extension

First, you define your macros in settings.json. You can open up vscode settings with Cmd+,, search for macros, and edit this under Macros: List.


"macros.list": {
"showTerminal": [
"workbench.action.terminal.toggleTerminal",
"workbench.action.terminal.focus",
"workbench.action.toggleMaximizedPanel"
],
"hideTerminal": [
"workbench.action.toggleMaximizedPanel",
"workbench.action.terminal.toggleTerminal"
]
}


Then you can bind a keyboard shortcut to it. Open Preferences: Open Keyboard Shortcuts (JSON) from the command palette, and add this:

    {
"key": "cmd+t",
"command": "macros.showTerminal",
"when": "!terminalFocus"
},
{
"key": "cmd+t",
"command": "macros.hideTerminal",
"when": "terminalFocus"
},


showTerminal gets called when you are focused anywhere in the editor except your terminal, and hideTerminal gets called when you are focused on your terminal. This gives me the ‘toggle’ behavior I want.

Sometimes you want to temporarily disable a JupyterLab extension on a JupyterHub by default, without having to rebuild your docker image. This can be very easily done with z2jh’s singleuser.extraFiles, and JupyterLab’s page_config.json

JupyterLab’s page_config.json lets you set page configuration by dropping JSON files under a labconfig directory inside any of the directories listed when you run jupyter --paths. We just use singleuser.extraFiles to provide this file!

singleuser:
extraFiles:
lab-config:
mountPath: /etc/jupyter/labconfig/page_config.json
data:
disabledExtensions:


This will disable the link-share labextension, both in JupyterLab and RetroLab. You can find the name of the extension, as well as its current status, with jupyter labextension list.

## Others?

I’m sure this isn’t the end - probably need something about firewalls, monitoring and automated system package upgrades. But hey, great start!

You can run your Jupyter Notebook locally, and connect easily to a remote dask-kubernetes cluster on a cloud-based Kubernetes Cluster with the help of kubefwd. This notebook will show you an example of how to do so. While this example is a Jupyter Notebook, the code will work any local python medium - REPL, IDE (vscode), or just plain ol' .py files

Latest executable version of this notebook can be found in this repository

## Credit

This work was sponsored by Quansight ❤️

## Create & setup a Kubernetes cluster

You need to have a working kubernetes cluster that is configured correctly. If you can get kubectl get ns to work properly, it means your cluster is working fine & connected for this to go.

## Install & run kubefwd

kubefwd lets you access services in your cloud Kubernetes cluster as if they were localy, with a clever combination of ol' school /etc/hosts hacks & fancy kubernetes port-forwarding. It requires root on Mac OS & Linux, and should theoretically work on Windows too (haven’t tested).

Once you have installed it, run it in a separate terminal.

sudo kubefwd svc -n default -n kube-system


If you’ve created your own namespace for your cluster, use that instead of default. The kube-system is required until this issue is fixed.

If the kubefwd command runs successfully, we’re good to go!

## Install libraries we’ll need

In addition to dask-kubernetes, we’ll also need numpy to test our cluster with dask arrays.

%pip install numpy dask distributed dask-kubernetes


Normally, the pod template would come from an external configuration file. We keep this in the notebook to make it more self contained.

POD_SPEC = {
'kind': 'Pod',
'spec': {
'restartPolicy': 'Never',
'containers': [
{
'args': [
'--death-timeout', '60'
],
}
]
}
}


## Create a remote cluster & connect to it

We create a KubeCluster object, with deploymode='remote'. This creates the scheduler as a pod on the cluster, so worker <-> scheduler communication is easy & efficient. kubefwd helps us communicate to this remote scheduler, so we can pretend we are actually on the remote cluster.

If you are using kubectl to watch the objects created in your namespace, you’ll see a service and a pod created for this. kubefwd should also list a log line about forwarding the service port locally.

from dask_kubernetes import KubeCluster

cluster = KubeCluster.from_dict(POD_SPEC, deploy_mode='remote')


## Create some workers

We have a scheduler in the cloud, now time to create some workers in the cloud! We create 2, and can watch the worker pods come up with glee in kubectl

All scaling methods (adaptive scaling, using the widget, etc) should work here.

cluster.scale(2)


## Run some computation!

We test our cluster by doing some trivial calculations with dask arrays. You can use any dask code as you normally would here, and it would run on the cloud Kubernetes cluster. This is especially helpful if you have large amounts of data in the cloud, since the workers would be really close to where the data is.

You might get warnings about version mismatches. This is ok for the demo, in production you’d probably build your own docker image that will have fixed versions.

from dask.distributed import Client

# Connect Dask to the cluster
client = Client(cluster)

# Create a large array and calculate the mean
array = da.ones((1000, 1000, 1000))
print(array.mean().compute())  # Should print 1.0


When you’re done with your cluster, remember to clean it up to release the resources!

This doesn’t affect your kubernetes cluster itself - you’ll need to clean that up manually

cluster.close()


## Next steps?

Lots more we can do from here.

### Ephemeral Kubernetes Clusters

We can wrap kubernetes cluster creation in some nice python functions, letting users create kubernetes clusters just-in-time for running a dask-kubernetes cluster, and tearing it down when they’re done. Users can thus ‘bring their own compute’ - since the clusters will be in their cloud accounts - without having the complication of understanding how the cloud works. This is where this would be different from the wonderful dask-gateway project I think.

### Remove kubefwd

kubefwd isn’t strictly necessary, and should ideally be replaced by a kubectl port-forward call that doesn’t require root. This should be possible with some changes to the dask-kubernetes code, so the client can connect to the scheduler via a different address (say, localhost:8974, since that’s what kubectl port-forward gives us) vs the workers (which need something like dask-cluster-scheduler-c12.namespace:8786, since that is in-cluster address).

Longer term, it would be great if we can get rid of spawning other processes altogether, if/when the python kubernetes client gains the ability to port-forward.

### Integration with TLJH

I love The Littlest JupyterHub (TLJH). A common use case is that a group of users need a JupyterHub, mostly doing work that’s well seved by TLJH. However, sometimes they need to scale up to a big dask cluster to do some work, but not for long. In these cases, I believe a combination of TLJH + Ephemeral Kubernetes Clusters is far simpler & easier to manage than running a full Kubernetes based JupyterHub. In addition, we can share the conda environment from TLJH with the dask workers, removing the need for users to think about docker images or environment mismatches completely. This is a massive win, and merits further exploration.

### ???

I am not actually an end user of dask, so I’m sure actual dask users will have much more ideas. Or they won’t, and this will just end up being a clever hack that gives me some joy :D Who knows!