Difference between revisions of "JupyterHub"

Revision as of 17:56, 20 August 2023 (Sunday)

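JupyterHub is a multi-user web server for Jupyter notebooks. It consists of four subsystems:

The main hub process.
Authenticators, which authenticate users.
Spawners, which start and monitor a single-user server for each connected user.
An HTTP proxy, which routes incoming requests to either the hub or the appropriate single-user server.

See the technical overview in the JupyterHub documentation for more details.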

Installation

Install the jupyterhub^AUR package. In most cases you will also need to install the jupyter-notebook package (some more advanced spawners may not need it). Installing the jupyterlab package additionally makes the JupyterLab interface available.

Running

Start/enable jupyterhub.service. With the default configuration, the hub is reachable by visiting 127.0.0.1:8000 in a browser.

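Configuration

The JupyterHub configuration file is located at /etc/jupyterhub/jupyterhub_config.py. It is a Python script which modifies a configuration object c. The configuration file provided by the package lists the available configuration options and their default values.

Relative paths in the configuration are resolved from the working directory the hub is run from. The systemd service provided by the package uses /etc/jupyterhub as its working directory. This means that, for example, the default database URL c.JupyterHub.db_url = 'sqlite:///jupyterhub.sqlite' corresponds to the file /etc/jupyterhub/jupyterhub.sqlite.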
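The path resolution described above can be sketched in Python. This is a minimal illustration rather than JupyterHub's own code: the helper name resolve_sqlite_path is hypothetical, and it assumes the usual SQLAlchemy-style convention that three slashes introduce a relative path and four slashes an absolute one.

```python
from pathlib import PurePosixPath


def resolve_sqlite_path(db_url: str, working_dir: str) -> str:
    """Return the file a sqlite:/// URL points to when run from working_dir.

    Hypothetical helper, for illustration only.
    """
    prefix = "sqlite:///"
    if not db_url.startswith(prefix):
        raise ValueError(f"not a file-based SQLite URL: {db_url!r}")
    # Everything after the third slash is the path. Joining an absolute
    # path onto working_dir simply yields that absolute path.
    path = db_url[len(prefix):]
    return str(PurePosixPath(working_dir) / path)


print(resolve_sqlite_path("sqlite:///jupyterhub.sqlite", "/etc/jupyterhub"))
```

Running the hub from a different working directory with the same URL would therefore point it at a different database file.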

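All configuration options can be overridden on the command line. For example, the configuration file setting c.Application.show_config = True could instead be set with the command-line flag --Application.show_config=True. Note that the provided systemd service uses the command line to explicitly set c.JupyterHub.pid_file and c.ConfigurableHTTPProxy.pid_file to a runtime directory, so any values for these in the configuration file are ignored.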
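The precedence rule can be pictured as a simple merge in which command-line values win. The sketch below only models that behaviour and is not how JupyterHub loads its configuration; the function name and both pid file paths are hypothetical.

```python
def effective_config(file_values: dict, cli_values: dict) -> dict:
    """Merge settings so that command-line values override file values."""
    merged = dict(file_values)
    merged.update(cli_values)
    return merged


file_values = {
    # Hypothetical value set in jupyterhub_config.py.
    "JupyterHub.pid_file": "/tmp/my-hub.pid",
    "Application.show_config": True,
}
cli_values = {
    # Hypothetical path passed on the command line by the service.
    "JupyterHub.pid_file": "/run/jupyterhub/jupyterhub.pid",
}

print(effective_config(file_values, cli_values))
```

Settings that appear only in the file, such as show_config here, are unaffected by the merge.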

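Authenticators

Authenticators control access to the hub and the single-user servers. The authenticators section of the documentation contains details about how authenticators work and how to write a custom authenticator. The authenticators wiki page has a list of authenticators; some of these have AUR packages and are described below.

Note that each user's status is stored in a cookie encrypted with the cookie secret. If you switch to a different authenticator, or change the settings of your chosen authenticator in a way that may change the list of permitted users, you should change the cookie secret. This logs out all current users and forces them to re-authenticate under the new settings. To do so, delete the cookie secret file and restart the hub; a new secret is generated automatically. With the default configuration, the cookie secret is stored in /etc/jupyterhub/jupyterhub_cookie_secret.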
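Deleting the cookie secret can be scripted. The following is a minimal sketch with a hypothetical helper name; it only removes the file, so the hub still has to be restarted afterwards for a new secret to be generated.

```python
from pathlib import Path


def remove_cookie_secret(
    path: str = "/etc/jupyterhub/jupyterhub_cookie_secret",
) -> bool:
    """Delete the cookie secret file; return True if a file was removed."""
    secret = Path(path)
    existed = secret.exists()
    secret.unlink(missing_ok=True)  # missing_ok requires Python 3.8+
    return existed
```

With the default location the function has to be run by a user who is allowed to delete the file, typically root.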

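PAM authenticator

The PAM authenticator uses PAM to allow local users to log in to the hub. It is included with JupyterHub and is the default authenticator. To authenticate users, the hub needs read permission for /etc/shadow, which contains hashed versions of user passwords. By default /etc/shadow is owned by root and has file permissions of -rw-------, so running the hub as root meets this requirement. Some sources advocate removing all permissions from /etc/shadow so it cannot be read by compromised daemons, and granting the DAC_OVERRIDE capability to processes which require access. If your /etc/shadow is set up like this, create a drop-in file for the service to grant this capability to JupyterHub: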

/etc/systemd/system/jupyterhub.service.d/override.conf

[Service]
CapabilityBoundingSet=CAP_DAC_OVERRIDE

The PAM authenticator relies on the Python package pamela. For basic troubleshooting, it can be tested on the command line. To attempt authentication as user testuser, run the following command:

# python -m pamela -a testuser

(If you run JupyterHub as a non-root user, run the command as that user instead of root.) If the authentication succeeds, no output is printed. If it fails, an error message is printed.
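If authentication fails, one thing worth ruling out is whether the account running the hub can read /etc/shadow at all. The check below is a sketch using only the standard library; the helper name is hypothetical, and os.access answers for the real user and group of the calling process, so it should be run as the same user as the hub.

```python
import os


def can_read(path: str = "/etc/shadow") -> bool:
    """Report whether the current process is allowed to read path."""
    return os.access(path, os.R_OK)


if __name__ == "__main__":
    print("readable" if can_read() else "not readable")
```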

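PAM authentication as non-root user

If you run JupyterHub as a non-root user, you will need to give that user read permission for the shadow file. The method recommended by the JupyterHub documentation is to create a shadow group, make the shadow file readable by this group, and add the JupyterHub user to this group.

Warning: This allows read-only access to the hashed passwords in /etc/shadow to anybody running code as the JupyterHub user. Note that each single-user server is run under its user's own account, so code executed in those servers will not have access. Also note that a security exploit in JupyterHub would allow the same access to the hashed passwords if JupyterHub was being run as root.

Creating the group, modifying the shadow file permissions and adding the user jupyterhub to the group can be accomplished with the following four commands: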

# groupadd shadow
# chgrp shadow /etc/shadow
# chmod g+r /etc/shadow
# usermod -aG shadow jupyterhub
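After running the commands, the group-read bit on the shadow file can be verified from Python. This is a small sketch with a hypothetical helper name; it inspects only the permission bits and does not check which group owns the file.

```python
import os
import stat


def group_can_read(path: str) -> bool:
    """Return True if the group-read permission bit is set on path."""
    return bool(os.stat(path).st_mode & stat.S_IRGRP)
```

For example, group_can_read("/etc/shadow") should return True once chmod g+r has been applied.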

Spawners

Spawners are responsible for starting and monitoring each user's notebook server. The spawners section of the documentation contains more details about how they work and how to write a custom spawner. The spawners wiki page has a list of spawners; some of these have AUR packages and are described below.

LocalProcessSpawner

This is the default spawner included with JupyterHub. It runs each single-user server in a separate local process under their user account (this means each JupyterHub user must correspond to a local user account). It also requires JupyterHub to be run as root so it can spawn the processes under the different user accounts. The jupyter-notebook package must be installed for this spawner to work.
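Because each hub user must map to a local account, it can help to check a username before expecting a server to start for it. The sketch below uses the standard pwd module; the helper name is hypothetical, and the lookup reflects the user database as the system sees it.

```python
import pwd


def has_local_account(username: str) -> bool:
    """Return True if username exists in the system user database."""
    try:
        pwd.getpwnam(username)
    except KeyError:
        return False
    return True
```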

SudoSpawner

The SudoSpawner uses an intermediate process created with sudo to spawn the single-user servers. This allows the JupyterHub process to be run as a non-root user. It is provided by the jupyterhub-sudospawner^AUR package.

To use it, create a system user account (the following assumes the account is named jupyterhub) and a group whose membership will define which users can access the hub (here assumed to be called jupyterhub-users). First, we have to configure sudo to allow the jupyterhub user to spawn a server without a password. Create a drop-in sudo configuration file with visudo:

# visudo -f /etc/sudoers.d/jupyterhub-sudospawner

# The command the hub is allowed to run.
Cmnd_Alias SUDOSPAWNER_CMD = /usr/bin/sudospawner

# Allow the jupyterhub user to run this command on behalf of anybody
# in the jupyterhub-users group.
jupyterhub ALL=(%jupyterhub-users) NOPASSWD:SUDOSPAWNER_CMD

The default service file runs the hub as root. It also applies a number of hardening options to the service to restrict its capabilities. This hardening prevents sudo from working; to allow it, the NoNewPrivileges service option (plus any other options which implicitly set it, see systemd.exec(5) for a list of service options) needs to be off. Create a drop-in file to run the hub using the jupyterhub user instead:

/etc/systemd/system/jupyterhub.service.d/override.conf

[Service]
User=jupyterhub
Group=jupyterhub

# Required for sudo.
NoNewPrivileges=false

# Setting the following would implicitly set NoNewPrivileges.
PrivateDevices=false
ProtectKernelTunables=false
ProtectKernelModules=false
LockPersonality=false
RestrictRealtime=false
RestrictSUIDGID=false
SystemCallFilter=
SystemCallArchitectures=

If you have previously run the hub as the root user, you will need to change the ownership of the user database and cookie secret files:

# chown jupyterhub:jupyterhub /etc/jupyterhub/{jupyterhub_cookie_secret,jupyterhub.sqlite}

If you are using the PAMAuthenticator, you will need to configure your system to allow it to work as a non-root user.

Finally, edit the JupyterHub configuration and change the spawner class to SudoSpawner:

/etc/jupyterhub/jupyterhub_config.py

c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'

To give a user access to the hub, add them to the jupyterhub-users group:

# usermod -aG jupyterhub-users <username>

systemdspawner

The systemdspawner uses systemd to manage each user's notebook, which allows configuring resource limitations, better process isolation and sandboxing, and dynamically allocated users. To use it, install the jupyterhub-systemdspawner^AUR package and set the spawner class in the configuration file:

/etc/jupyterhub/jupyterhub_config.py

c.JupyterHub.spawner_class = 'systemdspawner.SystemdSpawner'

Note that, as per systemdspawner's readme, using it currently requires JupyterHub to be run as root.

Services

A JupyterHub service is defined as a process which interacts with the Hub through its API. Services can either be run by the hub or as standalone processes.

Idle culler

The idle culler service can be used to automatically shut down idle single-user servers. To use it, install the jupyterhub-idle-culler^AUR package. To run the service through the hub, add a service description to the c.JupyterHub.services configuration variable:

/etc/jupyterhub/jupyterhub_config.py

import sys
c.JupyterHub.services = [
    {
        'name': 'idle-culler',
        'admin': True,
        'command': [
            sys.executable,
            '-m', 'jupyterhub_idle_culler',
            '--timeout=3600'
        ],
    }
]

See the service documentation or the output of python -m jupyterhub_idle_culler --help for a description of command-line options and details of how to run the service as a standalone process.
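If the timeout needs to vary, the service description can be built by a small function. This is a sketch that mirrors the entry above; the function name is hypothetical, and it assumes the --timeout value is given in seconds.

```python
import sys


def idle_culler_service(timeout_seconds: int = 3600) -> dict:
    """Build a service description for c.JupyterHub.services."""
    return {
        "name": "idle-culler",
        "admin": True,
        "command": [
            sys.executable,
            "-m", "jupyterhub_idle_culler",
            f"--timeout={timeout_seconds}",
        ],
    }


# Two hours of inactivity before a server is shut down.
print(idle_culler_service(2 * 60 * 60)["command"][-1])
```

In the configuration file the result would be used as c.JupyterHub.services = [idle_culler_service(7200)].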

Tips and tricks

Running as non-root user

By default, the main hub process is run as the root user (the individual user servers are run under the corresponding local user as set by the spawner). To run as a non-root user, you need to use the SudoSpawner (the other spawners listed above require running as root). If you are using the PAM authenticator, you will also need to configure it for a non-root user.

Using a reverse proxy

A reverse proxy can be used to redirect external requests to the JupyterHub instance. This can be useful if you want to serve multiple sites from one machine, or use an existing server to handle SSL. The "Using a reverse proxy" section of the JupyterHub documentation has example configurations for using either nginx or Apache as a reverse proxy.

Note: This does not replace the proxy component of JupyterHub, which is responsible for routing requests to either the main hub or the single-user servers. Rather, the reverse proxy passes external requests to the JupyterHub proxy.

Proxy other web services

The Jupyter Server Proxy extension allows you to run other web services such as Code Server or RStudio alongside JupyterHub and provide authenticated web access to them. To use it, install python-jupyter-server-proxy^AUR and configure it with the /etc/jupyter/jupyter_notebook_config.py file. For instance, to proxy code-server^AUR:

/etc/jupyter/jupyter_notebook_config.py

c.ServerProxy.servers = {
    'code-server': {
        'command': [
            'code-server',
            '--auth=none',
            '--disable-telemetry',
            '--disable-update-check',
            '--bind-addr=localhost:{port}',
            '--user-data-dir=.config/Code - OSS/',
            '--extensions-dir=.vscode-oss/extensions/'
        ],
        'timeout': 20,
        'launcher_entry': {
            'title': 'VS Code'
        }
    }
}

See the documentation for more details about configuring the Jupyter Server Proxy.
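The {port} placeholder in the command above is filled in when the service is started. The sketch below imitates that substitution to show what the final command looks like. It is not the extension's own code: the helper name is hypothetical and port 8123 is an arbitrary example value.

```python
def expand_command(command: list, port: int) -> list:
    """Replace the {port} placeholder in each argument with a concrete port."""
    return [arg.replace("{port}", str(port)) for arg in command]


command = ["code-server", "--auth=none", "--bind-addr=localhost:{port}"]
print(expand_command(command, 8123))
```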
