pinhole_operations
Pinhole camera projection and back-projection utilities.
- robotblockset.cameras.pinhole_operations.project_points_to_image_plane(positions_in_camera_frame: ndarray, camera_intrinsics: ndarray) ndarray[source]
Project 3D camera-frame points to image pixels.
- Parameters:
positions_in_camera_frame (Union[Vector3DArrayType, Vector3DType]) – 3D point positions in the camera frame.
camera_intrinsics (CameraIntrinsicsMatrixType) – Camera intrinsics matrix with shape (3, 3).
- Returns:
Pixel coordinates on the image plane.
- Return type:
Vector2DArrayType
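The math behind this function is the standard pinhole model: multiply by the intrinsics and divide by depth. A minimal numpy sketch (the helper name is illustrative, not the library's internal implementation):

```python
import numpy as np

def project_points(positions_in_camera_frame: np.ndarray, camera_intrinsics: np.ndarray) -> np.ndarray:
    points = np.atleast_2d(positions_in_camera_frame)  # (N, 3) camera-frame points
    projected = (camera_intrinsics @ points.T).T       # (N, 3) homogeneous pixels
    return projected[:, :2] / projected[:, 2:3]        # perspective divide by depth

# Example intrinsics: focal length 500 px, principal point (320, 240).
camera_intrinsics = np.array([[500.0, 0.0, 320.0],
                              [0.0, 500.0, 240.0],
                              [0.0, 0.0, 1.0]])
points = np.array([[0.0, 0.0, 1.0],    # on the optical axis
                   [0.1, 0.0, 2.0]])   # 10 cm to the right, 2 m away
pixels = project_points(points, camera_intrinsics)
print(pixels)  # the on-axis point maps to the principal point (320, 240)
```

Note that the perspective divide makes the result non-linear in the 3D position: the second point lands at u = 500 * 0.1 / 2.0 + 320 = 345.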
- robotblockset.cameras.pinhole_operations.multiview_triangulation_midpoint(extrinsics_matrices: List[ndarray], intrinsics_matrices: List[ndarray], image_coordinates: ndarray) ndarray[source]
Triangulate a point from multiple views using the midpoint method.
- Parameters:
extrinsics_matrices (List[CameraExtrinsicMatrixType]) – Extrinsics matrices for each viewpoint.
intrinsics_matrices (List[CameraIntrinsicsMatrixType]) – Intrinsics matrices for each viewpoint.
image_coordinates (Vector2DArrayType) – Image coordinates of the 3D point for each viewpoint.
- Returns:
Estimated 3D position in the world frame.
- Return type:
Vector3DType
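The midpoint method finds the 3D point minimising the sum of squared distances to the viewing rays. A sketch of the idea, assuming (as a convention, not confirmed by the source) that each extrinsics matrix is a 4x4 camera-to-world pose; the helper name is illustrative:

```python
import numpy as np

def triangulate_midpoint(extrinsics_matrices, intrinsics_matrices, image_coordinates):
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for pose, K, uv in zip(extrinsics_matrices, intrinsics_matrices, image_coordinates):
        ray_camera = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
        direction = pose[:3, :3] @ ray_camera   # ray direction in the world frame
        direction /= np.linalg.norm(direction)
        origin = pose[:3, 3]                    # camera center in the world frame
        # Projector onto the plane orthogonal to the ray direction.
        P = np.eye(3) - np.outer(direction, direction)
        A += P
        b += P @ origin
    # Normal equations of the least-squares point-to-rays problem.
    return np.linalg.solve(A, b)

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
pose_a = np.eye(4)       # camera at the origin, looking along +Z
pose_b = np.eye(4)
pose_b[0, 3] = 1.0       # second camera 1 m to the right
# Pixel observations of the world point (0, 0, 2) in both views:
point = triangulate_midpoint([pose_a, pose_b], [K, K],
                             np.array([[320.0, 240.0], [70.0, 240.0]]))
print(point)  # ≈ [0, 0, 2]
```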
- robotblockset.cameras.pinhole_operations.calculate_triangulation_errors(extrinsics_matrices: List[ndarray], intrinsics_matrices: List[ndarray], image_coordinates: ndarray, point: ndarray) List[float][source]
Compute per-view triangulation errors.
- Parameters:
extrinsics_matrices (List[CameraExtrinsicMatrixType]) – Extrinsics matrices for each viewpoint.
intrinsics_matrices (List[CameraIntrinsicsMatrixType]) – Intrinsics matrices for each viewpoint.
image_coordinates (Vector2DArrayType) – Image coordinates of the 3D point for each viewpoint.
point (Vector3DType) – Estimated 3D point in the world frame.
- Returns:
Euclidean distances between the point and each viewing ray.
- Return type:
List[float]
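Each per-view error is the Euclidean distance from the estimated point to that view's ray. A sketch under the same assumptions as above for this page's types (camera-to-world 4x4 extrinsics; illustrative helper name):

```python
import numpy as np

def triangulation_errors(extrinsics_matrices, intrinsics_matrices, image_coordinates, point):
    errors = []
    for pose, K, uv in zip(extrinsics_matrices, intrinsics_matrices, image_coordinates):
        ray_camera = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
        direction = pose[:3, :3] @ ray_camera
        direction /= np.linalg.norm(direction)
        origin = pose[:3, 3]
        offset = point - origin
        # Distance from the point to the line origin + t * direction.
        errors.append(float(np.linalg.norm(offset - np.dot(offset, direction) * direction)))
    return errors

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
pose = np.eye(4)
# A point exactly on the ray through the principal point has zero error:
errors = triangulation_errors([pose], [K], np.array([[320.0, 240.0]]),
                              np.array([0.0, 0.0, 2.0]))
print(errors)  # [0.0]
```

Large errors for a single view typically indicate a bad keypoint detection in that view rather than a bad triangulation overall.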
- robotblockset.cameras.pinhole_operations.unproject_using_depthmap(image_coordinates: ndarray, depth_map: ndarray, camera_intrinsics: ndarray, depth_heuristic_mask_size: int = 11, depth_heuristic_percentile: float = 0.05) ndarray[source]
Unproject image coordinates to 3D positions using a depth map.
- Parameters:
image_coordinates (Vector2DArrayType) – 2D pixel coordinates of the points on the image plane, with shape (N, 2).
depth_map (NumpyDepthMapType) – Depth map with shape (height, width); each value is the z-coordinate in the camera frame of the 3D point corresponding to that pixel.
camera_intrinsics (CameraIntrinsicsMatrixType) – Camera intrinsics matrix with shape (3, 3).
depth_heuristic_mask_size (int) – Size of the square region around each point from which depth values are sampled.
depth_heuristic_percentile (float) – Percentile of the sampled depth values used as the depth estimate.
- Returns:
3D positions of the points in the camera frame, with shape (N, 3).
- Return type:
np.ndarray
- robotblockset.cameras.pinhole_operations.unproject_onto_depth_values(image_coordinates: ndarray, depth_values: ndarray, camera_intrinsics: ndarray) ndarray[source]
Unproject image coordinates to 3D positions using depth values for each coordinate.
- Parameters:
image_coordinates (Vector2DArrayType) – 2D pixel coordinates of the points on the image plane, with shape (N, 2).
depth_values (np.ndarray) – Depth values with shape (N,); each value is the z-coordinate in the camera frame of the 3D point corresponding to that pixel.
camera_intrinsics (CameraIntrinsicsMatrixType) – Camera intrinsics matrix with shape (3, 3).
- Returns:
3D positions of the points in the camera frame, with shape (N, 3).
- Return type:
np.ndarray
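Unprojection is the inverse of the pinhole projection: a pixel (u, v) with depth z back-projects to z * K^-1 @ [u, v, 1]. A minimal numpy sketch (illustrative helper name):

```python
import numpy as np

def unproject_with_depths(image_coordinates, depth_values, camera_intrinsics):
    coords = np.atleast_2d(image_coordinates)                      # (N, 2)
    homogeneous = np.hstack([coords, np.ones((len(coords), 1))])   # (N, 3)
    rays = (np.linalg.inv(camera_intrinsics) @ homogeneous.T).T    # rays with z = 1
    # Scale each unit-depth ray by its depth value.
    return rays * np.asarray(depth_values).reshape(-1, 1)

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
positions = unproject_with_depths(np.array([[320.0, 240.0], [570.0, 240.0]]),
                                  np.array([1.0, 2.0]), K)
print(positions)  # [[0, 0, 1], [1, 0, 2]]
```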
- robotblockset.cameras.pinhole_operations.unproject_onto_world_z_plane(image_coordinates: ndarray, camera_intrinsics: ndarray, camera_in_frame_pose: ndarray, height: float) ndarray[source]
Unproject image points onto a world Z plane.
- Parameters:
image_coordinates (Vector2DArrayType) – Pixel coordinates on the image plane.
camera_intrinsics (CameraIntrinsicsMatrixType) – Camera intrinsics matrix with shape (3, 3).
camera_in_frame_pose (HomogeneousMatrixType) – Camera pose in the target frame.
height (float) – Height of the plane in the target frame.
- Returns:
3D points on the plane Z = height in the target frame.
- Return type:
Vector3DArrayType
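Geometrically, this casts a ray from the camera center through each pixel and intersects it with the plane Z = height in the target frame. A sketch, assuming the pose is a 4x4 camera-to-frame homogeneous matrix (the helper name is illustrative):

```python
import numpy as np

def unproject_onto_z_plane(image_coordinates, camera_intrinsics, camera_in_frame_pose, height):
    K_inv = np.linalg.inv(camera_intrinsics)
    rotation = camera_in_frame_pose[:3, :3]
    translation = camera_in_frame_pose[:3, 3]
    points = []
    for u, v in np.atleast_2d(image_coordinates):
        direction = rotation @ (K_inv @ np.array([u, v, 1.0]))  # ray in the target frame
        # Intersect: translation_z + t * direction_z == height.
        t = (height - translation[2]) / direction[2]
        points.append(translation + t * direction)
    return np.array(points)

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
pose = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, -1.0, 0.0, 0.0],
                 [0.0, 0.0, -1.0, 1.0],   # camera 1 m up, looking straight down
                 [0.0, 0.0, 0.0, 1.0]])
points = unproject_onto_z_plane(np.array([[320.0, 240.0]]), K, pose, height=0.0)
print(points)  # [[0, 0, 0]]: the principal ray hits the plane directly below the camera
```

This is useful for e.g. table-top scenes, where the object height above the table is known and no depth sensor is needed.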
- robotblockset.cameras.pinhole_operations.extract_depth_from_depthmap_heuristic(image_coordinates: ndarray, depth_map: ndarray, mask_size: int = 11, depth_percentile: float = 0.05) ndarray[source]
A simple heuristic to obtain more robust depth values from the depth map. Keypoints often lie on the edge of an object, or even worse on a corner. Not only are these regions noisy in themselves, the keypoints can also be slightly off.
This function takes a percentile of a region around the specified point and assumes we are interested in the nearest object present. This is not always true (think of the backside of a box viewed at a 45 degree angle), but it serves as a good proxy. The more confident you are in your keypoints and the better the heatmaps are, the lower you can set the mask size and percentile. If you are very confident, you could also query the point cloud directly instead of the depth map, but in practice point clouds tend to be noisier.
Also note that this function assumes there are no negative-infinity values in the depth map (no objects closer than 30 cm!).
- Parameters:
image_coordinates (Vector2DArrayType) – 2D pixel coordinates of the points on the image plane, with shape (N, 2).
depth_map (np.ndarray) – Depth map with shape (height, width).
mask_size (int) – Size of the square region around each point from which depth values are sampled.
depth_percentile (float) – Percentile of the sampled depth values used as the depth estimate.
- Returns:
1D array of depth values for the specified coordinates.
- Return type:
np.ndarray
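The heuristic described above can be sketched in a few lines of numpy: take a small window around each coordinate and return a low percentile of its depth values, biasing the estimate towards the nearest surface in the window (illustrative helper name):

```python
import numpy as np

def depth_heuristic(image_coordinates, depth_map, mask_size=11, depth_percentile=0.05):
    half = mask_size // 2
    depths = []
    for u, v in np.atleast_2d(image_coordinates):
        ui, vi = int(round(u)), int(round(v))
        # Window around the point, clipped to the image bounds.
        window = depth_map[max(vi - half, 0):vi + half + 1,
                           max(ui - half, 0):ui + half + 1]
        depths.append(np.quantile(window, depth_percentile))
    return np.array(depths)

# A keypoint on an object edge: the window straddles foreground (1 m) and
# background (3 m); the low percentile picks the foreground depth.
depth_map = np.full((100, 100), 3.0)
depth_map[:, :50] = 1.0
depths = depth_heuristic(np.array([[49.0, 50.0]]), depth_map)
print(depths)  # [1.0]
```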
Functions
| calculate_triangulation_errors | Compute per-view triangulation errors. |
| extract_depth_from_depthmap_heuristic | A simple heuristic to get more robust depth values of the depth map. |
| multiview_triangulation_midpoint | Triangulate a point from multiple views using the midpoint method. |
| project_points_to_image_plane | Project 3D camera-frame points to image pixels. |
| unproject_onto_depth_values | Unproject image coordinates to 3D positions using depth values for each coordinate. |
| unproject_onto_world_z_plane | Unproject image points onto a world Z plane. |
| unproject_using_depthmap | Unproject image coordinates to 3D positions using a depth map. |