interfaces

Abstract camera interfaces.

This module defines abstract camera interface classes for RGB, depth, stereo, and point-cloud capable sensors. It provides a consistent API for frame acquisition, camera intrinsics access, robot attachment, and conversion helpers across concrete camera backends. These interfaces establish shared coordinate and image conventions to ensure interchangeable camera implementations throughout the robotblockset camera stack.

Key functionalities include: - Abstract base interfaces for generic, RGB, RGB-D, and stereo RGB-D cameras. - Unified methods for grabbing and retrieving RGB, depth, confidence, and point-cloud data. - Camera intrinsics and resolution access through consistent typed APIs. - Attachment/detachment support for associating cameras with robot instances. - Coordinate-frame and image-indexing conventions for backend consistency. - Shared conversion helpers for point-cloud representations.

class robotblockset.cameras.interfaces.Camera[source]

Bases: ABC, rbs_object

Base class for all cameras

We use the right-handed, y-down convention for the camera frame: - origin is at the camera lens center - z-axis points forward (towards the scene) - x-axis points to the right - y-axis points down cf. https://www.stereolabs.com/docs/positional-tracking/coordinate-frames/#selecting-a-coordinate-system for an overview of other conventions

We use the (associated) top-left convention for the image plane: - origin is at the top left corner of the image - x-axis points to the right - y-axis points down

keep in mind though that images are indexed column-row in numpy, which corresponds to y-x in the cartesian image coordinates so to get the value of the pixel at (u,v) you need to do image[v,u] and the shape of the numpy array is (height, width) cf https://scikit-image.org/docs/stable/user_guide/numpy_images.html#numpy-indexing

Initialize the base object state.

Returns:

This constructor initializes the object name and verbosity level.

Return type:

None

Name = 'Camera'
Robot: robot | None = None
AttachTo(robot: robot) None[source]

Attach the sensor to a robot.

Parameters:

robot ('robot') – robot The robot to which the sensor is attached.

Detach() None[source]

Detach the sensor from the robot.

Sets the Robot attribute to None.

GetAttachedRobot() Tuple[robot | None, str][source]

Get the robot to which the sensor is attached.

Returns:

The robot object and its name, or (None, “None”) if the sensor is not attached to a robot.

Return type:

Tuple[Optional[robot], str]

abstractmethod intrinsics_matrix() ndarray[source]

returns the intrinsics matrix of the camera:

[[fx, 0, cx],

[0, fy, cy], [0, 0, 1]]

where all values are defined in pixels.

class robotblockset.cameras.interfaces.RGBCamera[source]

Bases: Camera, ABC

Base class for all RGB cameras

Initialize the base object state.

Returns:

This constructor initializes the object name and verbosity level.

Return type:

None

Name = 'RGBCamera'
abstract property resolution: Tuple[int, int]

The resolution of the camera, in pixels.

abstract property fps: float

The frames per second of the camera.

get_rgb_image() ndarray[source]

Get a new RGB image from the camera.

get_rgb_image_as_int() ndarray[source]

Get a new RGB image from the camera as uint8. This is faster to retrieve as images are typically stored as ints in the buffer. It is also more compact, so recommended for communication.

class robotblockset.cameras.interfaces.DepthCamera[source]

Bases: Camera, ABC

Base class for all depth cameras

Initialize the base object state.

Returns:

This constructor initializes the object name and verbosity level.

Return type:

None

Name = 'DepthCamera'
get_depth_map() ndarray[source]

Get the latest depth map of the camera. The depth map is a 2D array of floats, that provide the estimated z-coordinate in the camera frame of that point on the image plane (pixel).

Returns:

NumpyDepthMapType

Return type:

depth map as a float image.

get_depth_image() ndarray[source]

Return an 8-bit visualization depth image.

an 8-bit (int) quantization of the latest depth map, which can be used for visualization

get_confidence_map() ndarray[source]

Get a confidence map for the depth map.

The confidence map is a 2D array of floats, that provide a measure of confidence in the depth estimate for each pixel. The values are between 0 and 1, where 1 indicates high confidence and 0 indicates low confidence.

Note that not all stereo depth cameras provide a confidence map. We provide a default (naive) implementation. It computes a confidence map based on depth discontinuities using OpenCV’s Canny edge detection. This is based on the assumption that depth estimates are less reliable at depth discontinuities.

Child classes can override this method to provide a more accurate confidence map if available. For example: - StereoRGBDCamera provides a confidence map based on disparity between left and right RGB images using OpenCV’s SGBM algorithm implementation. - Realsense provides a confidence map based on disparity between left and right infrared images using OpenCV’s SGBM algorithm implementation. - ZED provides a confidence map based on the camera’s internal confidence measure.

See also: https://github.com/opencv/opencv_contrib/blob/master/modules/ximgproc/samples/disparity_filtering.cpp

Returns:

NumpyConfidenceMapType

Return type:

confidence map with values between 0 and 1.

class robotblockset.cameras.interfaces.RGBDCamera[source]

Bases: RGBCamera, DepthCamera

Base class for all RGBD cameras

Initialize the base object state.

Returns:

This constructor initializes the object name and verbosity level.

Return type:

None

Name = 'RGBDCamera'
get_colored_point_cloud() PointCloud[source]

Get the latest point cloud of the camera. The point cloud contains the estimated position in the camera frame of points on the image plane (pixels). Each point also has a color associated with it, which is the color of the corresponding pixel in the RGB image.

Returns:

PointCloud

Return type:

the points (= positions) and colors

class robotblockset.cameras.interfaces.StereoRGBDCamera[source]

Bases: RGBDCamera

Base class for all stereo RGBD cameras

Initialize the base object state.

Returns:

This constructor initializes the object name and verbosity level.

Return type:

None

Name = 'StereoRGBDCamera'
LEFT_RGB = 'left'
RIGHT_RGB = 'right'
get_rgb_image(view: str = 'left') ndarray[source]

Get a new RGB image from the camera.

get_rgb_image_as_int(view: str = 'left') ndarray[source]

Get a new RGB image from the camera as uint8. This is faster to retrieve as images are typically stored as ints in the buffer. It is also more compact, so recommended for communication.

abstractmethod intrinsics_matrix(view: str = 'left') ndarray[source]

returns the intrinsics matrix of the camera:

[[fx, 0, cx],

[0, fy, cy], [0, 0, 1]]

where all values are defined in pixels.

abstract property pose_of_right_view_in_left_view: ndarray

get the transform of the right view frame in the left view frame, ususally this is simply a translation along the x-axis and referred to as the disparity. cf https://en.wikipedia.org/wiki/Binocular_disparity

The left view is usually considered to be the ‘camera frame’, i.e. this is the frame that is used to define the camara extrinsics matrix

Classes

Camera()

Base class for all cameras

DepthCamera()

Base class for all depth cameras

RGBCamera()

Base class for all RGB cameras

RGBDCamera()

Base class for all RGBD cameras

StereoRGBDCamera()

Base class for all stereo RGBD cameras