Reachy Mini documentation

Media

Media Manager

class reachy_mini.media.media_manager.MediaManager

( backend: MediaBackend = <MediaBackend.LOCAL: 'local'> log_level: str = 'INFO' signalling_host: str = 'localhost' camera_specs: Optional[CameraSpecs] = None daemon_url: str = '' )

Parameters

logger — Logger instance for media-related messages.
backend — The selected media backend (after deprecation resolution).
camera — Camera device instance, or None.
audio — Audio device instance, or None.

Media Manager for handling camera and audio devices.

This class provides a unified interface for managing both camera and audio devices across different backends. It handles initialization, configuration, and cleanup of media resources.

close

( )

Close the media manager and release resources.

get_DoA

( )

Get the Direction of Arrival (DoA) from the microphone array.

get_audio_sample

( )

Get an audio sample from the audio device.

get_frame

( )

Get a frame from the camera.
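A minimal sketch of how a caller might collect a few frames and always release the devices afterwards. It assumes that get_frame() returns None when no frame is ready yet (the return type is not stated here), and grab_frames is a hypothetical helper, not part of the library. It accepts any object exposing get_frame() and close().

```python
from typing import Any, List


def grab_frames(media: Any, count: int, max_attempts: int = 100) -> List[Any]:
    """Collect up to `count` frames, skipping attempts that yield None."""
    frames: List[Any] = []
    try:
        for _ in range(max_attempts):
            if len(frames) >= count:
                break
            # Assumption: None means "no frame available yet".
            frame = media.get_frame()
            if frame is not None:
                frames.append(frame)
    finally:
        # Always release camera and audio resources, even on error.
        media.close()
    return frames
```

With the real class this would presumably be called as grab_frames(MediaManager(), count=5), since every constructor argument has a default.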

get_input_audio_samplerate

( )

Get the input samplerate of the audio device.

get_input_channels

( )

Get the number of input channels of the audio device.

get_output_audio_samplerate

( )

Get the output samplerate of the audio device.

get_output_channels

( )

Get the number of output channels of the audio device.

play_sound

( sound_file: str )

Parameters

sound_file — Path to the sound file to play.

Play a sound file.

Note: If the audio backend is not initialised, a warning is logged and the call is silently ignored.

push_audio_sample

( data: npt.NDArray[np.float32] )

Parameters

data — Audio samples as a float32 array. Shape should be (num_samples,) for mono or (num_samples, channels) for multi-channel. The manager adapts the data to match the output device’s channel count before forwarding it.

Push audio data to the output device.
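The channel adaptation mentioned above could look roughly like the following sketch. The exact rules the manager applies are not documented here, so the duplicate-mono, average-to-mono, and zero-pad choices below are assumptions, and adapt_channels is a hypothetical name.

```python
import numpy as np


def adapt_channels(data: np.ndarray, out_channels: int) -> np.ndarray:
    """Return float32 audio shaped (num_samples, out_channels)."""
    samples = np.asarray(data, dtype=np.float32)
    if samples.ndim == 1:
        samples = samples[:, np.newaxis]  # mono -> (num_samples, 1)
    in_channels = samples.shape[1]
    if in_channels == out_channels:
        return samples
    if in_channels == 1:
        # Duplicate the mono signal on every output channel.
        return np.repeat(samples, out_channels, axis=1)
    if out_channels == 1:
        # Average all input channels down to mono.
        return samples.mean(axis=1, keepdims=True)
    # Otherwise keep the shared channels and zero-pad any missing ones.
    adapted = np.zeros((samples.shape[0], out_channels), dtype=np.float32)
    shared = min(in_channels, out_channels)
    adapted[:, :shared] = samples[:, :shared]
    return adapted
```

The device channel count to pass as out_channels would come from get_output_channels().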

start_playing

( )

Start playing audio.

start_recording

( )

Start recording audio.

stop_playing

( )

Stop playing audio.

stop_recording

( )

Stop recording audio.
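The start/stop pairs above suggest a record-then-play sequence such as the sketch below. It assumes get_audio_sample() returns None when nothing is buffered and that recorded chunks can be pushed back unchanged when the input and output sample rates match; echo_back is a hypothetical helper that works with any object exposing these methods.

```python
import time
from typing import Any, List


def echo_back(media: Any, num_chunks: int, max_polls: int = 1000,
              poll_interval: float = 0.01) -> int:
    """Record `num_chunks` audio chunks, then push them to the output."""
    if media.get_input_audio_samplerate() != media.get_output_audio_samplerate():
        raise ValueError("input and output sample rates differ; resample first")

    chunks: List[Any] = []
    media.start_recording()
    try:
        for _ in range(max_polls):
            if len(chunks) >= num_chunks:
                break
            sample = media.get_audio_sample()  # assumed None when empty
            if sample is None:
                time.sleep(poll_interval)  # avoid a busy loop while waiting
            else:
                chunks.append(sample)
    finally:
        media.stop_recording()

    media.start_playing()
    try:
        for chunk in chunks:
            media.push_audio_sample(chunk)
    finally:
        # Note: queued audio may still be draining when this is called;
        # real code may need to wait before stopping playback.
        media.stop_playing()
    return len(chunks)
```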

Audio

class reachy_mini.media.audio_gstreamer.GStreamerAudio

( log_level: str = 'INFO' )

Audio implementation using GStreamer.

Extends AudioBase with two GStreamer-specific helpers:

clear_output_buffer(): flush queued playback data without stopping the pipeline (no-op by default; useful before refilling the buffer).
clear_player(): flush the playback appsrc immediately via GStreamer flush events, dropping any queued audio.

cleanup

( )

Release all resources (pipelines, USB devices).

clear_output_buffer

( )

Flush queued playback data so it is not played.

A low set_max_output_buffers value may make this unnecessary for most use-cases.

clear_player

( )

Flush the player’s appsrc to drop any queued audio immediately.

delete_sound

( filename: str )

No-op for the local backend.

list_sounds

( )

No-op for the local backend.

play_sound

( sound_file: str )

Parameters

sound_file — Absolute path or filename relative to the built-in assets directory.

Raises

FileNotFoundError

FileNotFoundError — If the file cannot be found.

Play a sound file through the Reachy Mini Audio card.

The file is played via a GStreamer playbin routed to the same audio sink used by the push-based playback pipeline.
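The lookup described by the sound_file parameter (absolute path, or a name relative to the assets directory, with FileNotFoundError otherwise) can be sketched as follows. resolve_sound_path and the assets_dir argument are hypothetical; the library's actual assets location and lookup order are not shown here.

```python
from pathlib import Path


def resolve_sound_path(sound_file: str, assets_dir: Path) -> Path:
    """Resolve an absolute path or an asset-relative filename to a file."""
    candidate = Path(sound_file)
    if candidate.is_absolute():
        if candidate.is_file():
            return candidate
        raise FileNotFoundError(f"Sound file not found: {candidate}")
    # Relative names are looked up inside the assets directory.
    in_assets = assets_dir / candidate
    if in_assets.is_file():
        return in_assets
    raise FileNotFoundError(f"Sound file not found in {assets_dir}: {sound_file}")
```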

push_audio_sample

( data: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float32]] )

Parameters

data — Audio samples as a float32 array. Shape should be (num_samples, 2) for stereo or (num_samples,) for mono (the caller is responsible for channel adaptation).

Push audio data to the speaker.

start_playing

( )

Start the playback pipeline so push_audio_sample can feed data.

upload_sound

( sound_file: str )

No-op for the local backend — the file is already accessible.

Audio Utils Functions

reachy_mini.media.audio_utils.get_respeaker_card_number

( ) → int

Returns

int

The card number of the detected ReSpeaker/Reachy Mini Audio device. Returns 0 if no specific device is found (uses default sound card), or -1 if there’s an error running the detection command.

Return the card number of the ReSpeaker sound card, 0 if no such card is found, or -1 if detection fails.

Note: This function runs ‘arecord -l’ to list available audio capture devices and processes the output to find Reachy Mini Audio or ReSpeaker devices. It’s primarily used on Linux systems with ALSA audio configuration.

The function returns:

Positive integer: Card number of detected Reachy Mini Audio device
0: No Reachy Mini Audio device found, using default sound card
-1: Error occurred while trying to detect audio devices

Example:

from reachy_mini.media.audio_utils import get_respeaker_card_number

card_num = get_respeaker_card_number()
if card_num > 0:
    print(f"Using Reachy Mini Audio card {card_num}")
elif card_num == 0:
    print("Using default sound card")
else:
    print("Error detecting audio devices")
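To illustrate the parsing step the note describes, here is a sketch that scans arecord -l style output for a matching card. The keyword list and the helper name find_card_number are assumptions; the sample text follows the usual "card N: NAME [DESCRIPTION], device M: ..." layout and is made up for the example.

```python
import re
from typing import Sequence

SAMPLE_OUTPUT = """\
**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC3246 Analog [ALC3246 Analog]
card 2: Audio [Reachy Mini Audio], device 0: USB Audio [USB Audio]
"""


def find_card_number(
    arecord_output: str,
    keywords: Sequence[str] = ("reachy mini audio", "respeaker"),
) -> int:
    """Return the first card whose line mentions a keyword, else 0."""
    for line in arecord_output.splitlines():
        match = re.match(r"card (\d+):", line.strip())
        if match and any(key in line.lower() for key in keywords):
            return int(match.group(1))
    return 0  # nothing matched: fall back to the default sound card


print(find_card_number(SAMPLE_OUTPUT))
```

Running arecord itself (and mapping a failure to -1) is left out so the sketch stays independent of ALSA.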

reachy_mini.media.audio_utils.has_reachymini_asoundrc

( ) → bool

Returns

bool

True if ~/.asoundrc exists and contains the required Reachy Mini audio configuration entries, False otherwise.

Check if ~/.asoundrc exists and contains both reachymini_audio_sink and reachymini_audio_src.

Note: This function checks for the presence of the ALSA configuration file ~/.asoundrc and verifies that it contains the necessary configuration entries for Reachy Mini audio devices (reachymini_audio_sink and reachymini_audio_src). These entries are required for proper audio routing and device management.

Example:

from reachy_mini.media.audio_utils import (
    has_reachymini_asoundrc,
    write_asoundrc_to_home,
)

if has_reachymini_asoundrc():
    print("Reachy Mini audio configuration is properly set up")
else:
    print("Need to configure Reachy Mini audio devices")
    write_asoundrc_to_home()  # Create the configuration
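The check itself amounts to a file-existence test plus a search for the two entry names. A minimal sketch, assuming a plain substring match is sufficient (the library may parse the file more strictly); asoundrc_has_entries is a hypothetical name:

```python
from pathlib import Path
from typing import Sequence

REQUIRED_ENTRIES = ("reachymini_audio_sink", "reachymini_audio_src")


def asoundrc_has_entries(
    path: Path, required: Sequence[str] = REQUIRED_ENTRIES
) -> bool:
    """Return True if `path` exists and mentions every required entry."""
    if not path.is_file():
        return False
    text = path.read_text(encoding="utf-8", errors="replace")
    return all(name in text for name in required)
```

It would be called as asoundrc_has_entries(Path.home() / ".asoundrc").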

reachy_mini.media.audio_utils.check_reachymini_asoundrc

( )

Check if ~/.asoundrc exists and is correctly configured for Reachy Mini Audio.

reachy_mini.media.audio_utils.write_asoundrc_to_home

( )

Write the .asoundrc file with Reachy Mini audio configuration to the user’s home directory.

This function creates an ALSA configuration file (.asoundrc) in the user’s home directory that configures the ReSpeaker sound card for proper audio routing and multi-client support. The configuration enables simultaneous audio input and output access, which is essential for the Reachy Mini Wireless version’s audio functionality.

The generated configuration includes:

Default audio device settings pointing to the ReSpeaker sound card
dmix plugin for multi-client audio output (reachymini_audio_sink)
dsnoop plugin for multi-client audio input (reachymini_audio_src)
Proper buffer and sample rate settings for optimal performance

Note: This function automatically detects the ReSpeaker card number and creates a configuration tailored to the detected hardware. It is primarily used for the Reachy Mini Wireless version.

The configuration file will be created at ~/.asoundrc and will overwrite any existing file with the same name. Existing audio configurations should be backed up before calling this function.
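Because the existing file is overwritten, a backup step along these lines may be useful before calling the function. backup_asoundrc and the timestamped backup name are illustrative choices, not part of the library.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def backup_asoundrc(home: Path) -> Optional[Path]:
    """Copy ~/.asoundrc to a timestamped backup; return its path or None."""
    source = home / ".asoundrc"
    if not source.is_file():
        return None  # nothing to back up
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = home / f".asoundrc.bak-{stamp}"
    shutil.copy2(source, backup)  # copy2 also preserves file metadata
    return backup
```

A typical sequence would be backup_asoundrc(Path.home()) followed by write_asoundrc_to_home().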

Audio Control Utils Functions

class reachy_mini.media.audio_control_utils.ReSpeaker

( dev: Device )

Class to interface with the ReSpeaker XVF3800 USB device.

close

( )

Close the interface.

read

( name: str )

Read data from a specified parameter on the ReSpeaker device.

write

( name: str data_list: Any )

Write data to a specified parameter on the ReSpeaker device.

reachy_mini.media.audio_control_utils.find

( vid: int = 10374 pid: int = 26 ) → ReSpeaker | None

Parameters

vid (int) — USB Vendor ID to search for. Default: 0x2886 (10374 in decimal), the vendor ID reported by ReSpeaker devices.
pid (int) — USB Product ID to search for. Default: 0x001A (26 in decimal), the XMOS XVF3800.

Returns

ReSpeaker | None

A ReSpeaker object if the device is found, None otherwise.

Find and return the ReSpeaker USB device with the given Vendor ID and Product ID.

Note: This function searches for USB devices with the specified Vendor ID and Product ID using the libusb backend. The default values target the XMOS XVF3800 devices used in ReSpeaker microphone arrays.

Example:

from reachy_mini.media.audio_control_utils import find

# Find default ReSpeaker device
respeaker = find()
if respeaker is not None:
    print("Found ReSpeaker device")
    respeaker.close()

# Find specific device
custom_device = find(vid=0x1234, pid=0x5678)

reachy_mini.media.audio_control_utils.init_respeaker_usb

( ) → Optional[ReSpeaker]

Returns

Optional[ReSpeaker]

A ReSpeaker object if a compatible device is found, None otherwise.

Initialize the ReSpeaker USB device. Looks for both the new and the older (beta) device IDs.

Note: This function attempts to initialize a ReSpeaker microphone array by searching for USB devices with known Vendor and Product IDs. It tries:

New Reachy Mini Audio firmware (0x38FB:0x1001) - preferred
Old ReSpeaker firmware (0x2886:0x001A) - with warning to update

The function handles USB backend errors gracefully and returns None if no compatible device is found or if initialization fails.

Example:

from reachy_mini.media.audio_control_utils import init_respeaker_usb

# Initialize ReSpeaker device
respeaker = init_respeaker_usb()
if respeaker is not None:
    print("ReSpeaker initialized successfully")
    # Use the device...
    doa = respeaker.read("DOA_VALUE_RADIANS")
    respeaker.close()
else:
    print("No ReSpeaker device found")
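The lookup order described in the note can be sketched as a loop over the known ID pairs. The ID values come from the list above; init_with_fallback, the injected find_device callable, and the log messages are hypothetical, and the real function may behave differently on errors.

```python
import logging
from typing import Any, Callable, Optional, Sequence, Tuple

logger = logging.getLogger(__name__)

# (vendor ID, product ID, needs firmware update), in order of preference.
KNOWN_IDS: Sequence[Tuple[int, int, bool]] = (
    (0x38FB, 0x1001, False),  # new Reachy Mini Audio firmware
    (0x2886, 0x001A, True),   # old ReSpeaker firmware
)


def init_with_fallback(
    find_device: Callable[[int, int], Optional[Any]],
    known_ids: Sequence[Tuple[int, int, bool]] = KNOWN_IDS,
) -> Optional[Any]:
    """Return the first device found among `known_ids`, or None."""
    for vid, pid, needs_update in known_ids:
        try:
            device = find_device(vid, pid)
        except Exception as exc:  # e.g. a missing USB backend
            logger.warning("USB lookup failed for %04x:%04x: %s", vid, pid, exc)
            return None
        if device is not None:
            if needs_update:
                logger.warning("Old firmware detected; consider updating.")
            return device
    return None
```

With the library's own lookup this might be wired up as init_with_fallback(lambda vid, pid: find(vid=vid, pid=pid)).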

Camera

class reachy_mini.media.camera_gstreamer.GStreamerCamera

( log_level: str = 'INFO' camera_specs: typing.Optional[reachy_mini.media.camera_constants.CameraSpecs] = None )

Parameters

camera_specs — Camera specifications (resolutions, intrinsics, …).

Camera that reads BGR frames from the daemon’s local IPC endpoint.

The WebRTC daemon exposes BGR camera frames via a local IPC mechanism:

Linux / macOS: unixfdsink / unixfdsrc (Unix domain socket)
Windows: win32ipcvideosink / win32ipcvideosrc (shared memory)

Since the daemon’s IPC branch already converts to BGR, the reader pipeline is simply source → queue → appsink with no extra conversion.
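As an illustration of the source → queue → appsink layout, the sketch below builds a gst-launch style description per platform. The endpoint value is hypothetical, the socket-path and pipe-name property names are what these elements are believed to expose (worth confirming with gst-inspect-1.0), and the appsink settings are illustrative choices rather than the library's.

```python
import sys


def reader_pipeline(endpoint: str, platform: str = sys.platform) -> str:
    """Build a pipeline description that reads frames from the IPC endpoint."""
    if platform.startswith("win"):
        # Windows: shared-memory IPC source.
        source = f'win32ipcvideosrc pipe-name="{endpoint}"'
    else:
        # Linux / macOS: Unix domain socket source.
        source = f'unixfdsrc socket-path="{endpoint}"'
    # Keep only the newest frame so a slow reader never sees stale data.
    return f"{source} ! queue ! appsink name=sink max-buffers=1 drop=true"


print(reader_pipeline("/tmp/example_camera.sock", platform="linux"))
```

A string like this could then be handed to Gst.parse_launch() in an environment where GStreamer's Python bindings are installed.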

close

( )

Stop the pipeline and release resources.

open

( )

Start the GStreamer pipeline and begin receiving frames.

read

( )

Pull the latest BGR frame from the IPC endpoint.
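Since frames arrive in BGR order, code that feeds them to RGB-based libraries needs to swap the channel axis. A small sketch, assuming the frame is a NumPy array shaped (height, width, 3); bgr_to_rgb is a hypothetical helper:

```python
import numpy as np


def bgr_to_rgb(frame: np.ndarray) -> np.ndarray:
    """Reverse the last (channel) axis and return a contiguous copy."""
    return np.ascontiguousarray(frame[..., ::-1])
```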

Camera Utils Functions

reachy_mini.media.camera_utils.undistort_points

( u: float v: float K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] max_iterations: int = 20 epsilon: float = 0.01 ) → Tuple (x_n, y_n)

Parameters

u — Horizontal pixel coordinate.
v — Vertical pixel coordinate.
K — 3x3 camera intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]].
D — Distortion coefficients array. Supports lengths 0, 4, 5, 8, 12, or 14. Unused positions default to 0.
max_iterations — Maximum number of iterations (default 20).
epsilon — Convergence threshold in pixel reprojection error (default 0.01).

Returns

Tuple (x_n, y_n)

Normalized undistorted coordinates (on the z=1 plane).

Undistort a single pixel coordinate to normalized camera coordinates.

Pure numpy equivalent of cv2.undistortPoints(). Supports the OpenCV distortion model with up to 12 coefficients (rational model + thin prism): D = (k1, k2, p1, p2, k3, k4, k5, k6, s1, s2, s3, s4)

Also works with 5-coefficient models (k1, k2, p1, p2, k3) and zero-distortion.

The algorithm matches OpenCV’s cvUndistortPointsInternal:

Remove camera intrinsics to get normalized distorted coordinates.
Iteratively solve for undistorted coordinates using a damped fixed-point method with adaptive step size.

Reference: OpenCV distortion model and undistortPoints algorithm:
https://docs.opencv.org/4.x/d9/d0c/group__calib3d.html
https://github.com/opencv/opencv/blob/4.x/modules/calib3d/src/undistort.dispatch.cpp

reachy_mini.media.camera_utils.scale_intrinsics

( K_original: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] original_size: typing.Tuple[int, int] target_size: typing.Tuple[int, int] crop_scale: float ) → K_scaled

Parameters

K_original — Original 3x3 camera matrix
original_size — (width, height) of original calibration
target_size — (width, height) of target resolution
crop_scale — Scale factor due to digital zoom/crop (>1 means more zoomed in)

Returns

K_scaled

Adjusted camera matrix for target resolution

Scale camera intrinsics for a different resolution with cropping.
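One plausible model for this adjustment, offered only as a sketch: the original image is center-cropped by crop_scale and the crop is then resized to the target size. Under that assumption the focal lengths scale by crop_scale × target/original and the principal point keeps its offset from the image center. The library's actual formula is not shown here and may differ; scale_intrinsics_sketch is a hypothetical name, and skew is assumed to be zero.

```python
from typing import Tuple

import numpy as np


def scale_intrinsics_sketch(
    K: np.ndarray,
    original_size: Tuple[int, int],
    target_size: Tuple[int, int],
    crop_scale: float = 1.0,
) -> np.ndarray:
    """Rescale a 3x3 intrinsic matrix for a center crop followed by a resize."""
    orig_w, orig_h = original_size
    target_w, target_h = target_size
    sx = crop_scale * target_w / orig_w
    sy = crop_scale * target_h / orig_h

    K_scaled = np.array(K, dtype=np.float64)  # copy, leave the input untouched
    cx, cy = K_scaled[0, 2], K_scaled[1, 2]
    K_scaled[0, 0] *= sx  # fx
    K_scaled[1, 1] *= sy  # fy
    # Keep the principal point's offset from the center, scaled like the pixels.
    K_scaled[0, 2] = sx * (cx - orig_w / 2) + target_w / 2
    K_scaled[1, 2] = sy * (cy - orig_h / 2) + target_h / 2
    return K_scaled
```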

Camera Constants

class reachy_mini.media.camera_constants.CameraResolution

( *values )

Parameters

R1536x864at40fps — 1536x864 resolution at 40 fps
R1280x720at60fps — 1280x720 resolution at 60 fps (HD)
R1280x720at30fps — 1280x720 resolution at 30 fps (HD)
R1920x1080at30fps — 1920x1080 resolution at 30 fps (Full HD)
R1920x1080at60fps — 1920x1080 resolution at 60 fps (Full HD)
R2304x1296at30fps — 2304x1296 resolution at 30 fps
R1600x1200at30fps — 1600x1200 resolution at 30 fps
R3264x2448at30fps — 3264x2448 resolution at 30 fps
R3264x2448at10fps — 3264x2448 resolution at 10 fps
R3840x2592at30fps — 3840x2592 resolution at 30 fps
R3840x2592at10fps — 3840x2592 resolution at 10 fps
R3840x2160at30fps — 3840x2160 resolution at 30 fps (4K UHD)
R3840x2160at10fps — 3840x2160 resolution at 10 fps (4K UHD)
R3072x1728at10fps — 3072x1728 resolution at 10 fps
R4608x2592at10fps — 4608x2592 resolution at 10 fps

Base class for camera resolutions.

Enumeration of standardized camera resolutions and frame rates supported by Reachy Mini cameras. Each enum value is a tuple of (width, height, frames_per_second, crop_factor).

Note: Not all resolutions are supported by all camera models; check the specific camera specifications for available resolutions.

Example:

from reachy_mini.media.camera_constants import (
    CameraResolution,
    ReachyMiniLiteCamSpecs,
)

# Get resolution information
res = CameraResolution.R1280x720at30fps
width, height, fps, crop_factor = res.value
print(f"Resolution: {width}x{height}@{fps}fps")

# Check if a resolution is supported by a camera
res = CameraResolution.R1920x1080at60fps
if res in ReachyMiniLiteCamSpecs.available_resolutions:
    print("This resolution is supported")

class reachy_mini.media.camera_constants.CameraSpecs

( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )

Parameters

name (str) — Human-readable name of the camera model.
available_resolutions (List[CameraResolution]) — List of supported resolutions and frame rates for this camera model.
default_resolution (CameraResolution) — Default resolution used when the camera is initialized.
vid (int) — USB Vendor ID for identifying this camera model.
pid (int) — USB Product ID for identifying this camera model.
K (npt.NDArray[np.float64]) — 3x3 camera intrinsic matrix containing focal lengths and principal point coordinates.
D (npt.NDArray[np.float64]) — 5-element array containing distortion coefficients (k1, k2, p1, p2, k3) for radial and tangential distortion.

Base camera specifications.

Dataclass containing specifications for a camera model, including supported resolutions, calibration parameters, and USB identification information.

Note: The intrinsic matrix K has the format: [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]

Where fx, fy are focal lengths in pixels, and cx, cy are the principal point coordinates (typically near the image center).

Example:

import numpy as np

from reachy_mini.media.camera_constants import CameraResolution, CameraSpecs

# Create a custom camera specification.
# vid and pid are left out because the constructor signature above does not list them.
custom_specs = CameraSpecs(
    name="custom_camera",
    available_resolutions=[CameraResolution.R1280x720at30fps],
    default_resolution=CameraResolution.R1280x720at30fps,
    K=np.array([[800.0, 0.0, 640.0], [0.0, 800.0, 360.0], [0.0, 0.0, 1.0]]),
    D=np.zeros(5),
)

class reachy_mini.media.camera_constants.ArducamSpecs

( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )

Arducam camera specifications.

class reachy_mini.media.camera_constants.ReachyMiniLiteCamSpecs

( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )

Reachy Mini Lite camera specifications.

class reachy_mini.media.camera_constants.ReachyMiniWirelessCamSpecs

( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )

Reachy Mini Wireless camera specifications.

class reachy_mini.media.camera_constants.OlderRPiCamSpecs

( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )

Older Raspberry Pi camera specifications. Keeping for compatibility.

class reachy_mini.media.camera_constants.MujocoCameraSpecs

( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )

MuJoCo simulated camera specifications.

class reachy_mini.media.camera_constants.GenericWebcamSpecs

( name: str = '' available_resolutions: typing.List[reachy_mini.media.camera_constants.CameraResolution] = <factory> default_resolution: CameraResolution = <CameraResolution.R1280x720at30fps: (1280, 720, 30, 1.0)> K: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> D: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float64]] = <factory> )

Generic webcam specifications (fallback for any webcam).

WebRTC

class reachy_mini.media.webrtc_client_gstreamer.GstWebRTCClient

( log_level: str = 'INFO' peer_id: str = '' signaling_host: str = '' signaling_port: int = 8443 camera_specs: typing.Optional[reachy_mini.media.camera_constants.CameraSpecs] = None )

WebRTC client that provides both camera frames and audio.

Implements the same public API surface as GStreamerCamera (for video) and GStreamerAudio (for audio) so that MediaManager can assign the same instance to both its camera and audio slots.

cleanup

( )

Release all resources.

clear_output_buffer

( )

No-op (WebRTC send chain does not buffer significantly).

close

( )

Stop the WebRTC pipeline.

delete_sound

( filename: str )

Parameters

filename — Name of the file to delete (not a full path).

Delete a sound file from the daemon’s temporary sound directory.

get_DoA

( )

Get the Direction of Arrival from the ReSpeaker.

list_sounds

( )

List sound files in the daemon’s temporary sound directory.

open

( )

Start the WebRTC pipeline (both video and audio).

play_sound

( sound_file: str )

Parameters

sound_file — Absolute local path or asset filename (e.g. "wake_up.wav").

Play a sound file on the robot’s speaker via the daemon REST API.

If sound_file is a local path that exists on this machine the file is automatically uploaded to the daemon’s temporary sound directory (skipping the upload when a file with the same name is already present). Otherwise the filename is sent as-is and the daemon resolves it from its built-in assets or filesystem.
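The upload decision described above can be sketched as a pure function. It assumes the daemon is afterwards asked to play the file by its bare name, which is not stated explicitly; plan_play_sound is a hypothetical helper, and remote_files stands for the names returned by list_sounds().

```python
from pathlib import Path
from typing import Iterable, Tuple


def plan_play_sound(sound_file: str, remote_files: Iterable[str]) -> Tuple[bool, str]:
    """Return (needs_upload, name_to_send) for a play_sound request."""
    local = Path(sound_file)
    if local.is_file():
        # Skip the upload when a file with the same name is already present.
        needs_upload = local.name not in set(remote_files)
        return needs_upload, local.name
    # Not a local file: send the name as-is and let the daemon resolve it.
    return False, sound_file
```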

push_audio_sample

( data: numpy.ndarray[tuple[typing.Any, ...], numpy.dtype[numpy.float32]] )

Parameters

data — Float32 audio samples.

Push audio data to the remote peer via WebRTC.

read

( )

Pull the latest BGR video frame.

start_playing

( )

No-op — audio send chain is set up automatically on WebRTC connection.

start_recording

( )

No-op — recording starts automatically with open().

stop_playing

( )

Reset the PTS counter for the send chain and stop daemon-side sound.

stop_recording

( )

No-op — managed by close().

upload_sound

( sound_file: str )

Parameters

sound_file — Local path to the sound file.

Raises

FileNotFoundError or requests.HTTPError

FileNotFoundError — If sound_file does not exist locally.
requests.HTTPError — If the upload request fails.

Upload a local sound file to the daemon’s temporary directory.

class reachy_mini.media.media_server.GstMediaServer

( log_level: str = 'INFO' sim_mode: SimulationMode = <SimulationMode.NONE: 'none'> )

Parameters

camera_specs (CameraSpecs) — Specifications of the detected camera.
resized_K (npt.NDArray[np.float64]) — Camera intrinsic matrix for current resolution.

Daemon-side GStreamer media server.

Owns the camera and audio hardware and distributes media to consumers:

IPC branch — raw BGR frames via unixfdsink / win32ipcvideosink for on-device applications (GStreamerCamera reads from this).
WebRTC branch — encoded video + audio via webrtcsink for remote clients (GstWebRTCClient connects to this).
Sound playback — playbin for playing WAV files on the speaker.

close

( )

Release GStreamer resources (MainLoop, bus watch).

play_sound

( sound_file: str )

Parameters

sound_file — Path to the sound file to play. If the file is not found at the given path, it is looked up in the assets directory.

Play a sound file on the robot’s speaker.

Uses GStreamer’s playbin element with a platform-aware audio sink. This is used for daemon-side sounds (wake-up, sleep, etc.).

send_data_message

( peer_id: typing.Optional[str] message: str )

Parameters

message — The string message to send
peer_id — If specified, send only to this peer. Otherwise broadcast to all.

Send a message to connected peers via data channel.

set_message_handler

( handler: typing.Callable[[str, str], NoneType] )

Parameters

handler — Callback function that receives (peer_id, message)

Set a callback for incoming data channel messages.
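A handler receives (peer_id, message) and can reply through send_data_message. The sketch below builds a simple echo handler; make_echo_handler and the "echo: " prefix are illustrative, and the server argument is any object with a compatible send_data_message(peer_id, message) method.

```python
from typing import Any, Callable


def make_echo_handler(server: Any) -> Callable[[str, str], None]:
    """Create a handler that sends each message back to the peer that sent it."""

    def handle(peer_id: str, message: str) -> None:
        # Reply only to the originating peer rather than broadcasting.
        server.send_data_message(peer_id, f"echo: {message}")

    return handle
```

It would be registered with server.set_message_handler(make_echo_handler(server)).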

start

( )

Rebuild the pipeline from scratch and start it.

Rebuilding ensures a clean state after stop() released all hardware.

stop

( )

Stop the pipeline and release all hardware (camera, audio).

stop_sound

( )

Stop the currently playing sound file.

If no sound is currently playing this is a no-op.
