# Pointcloud Uses and Formats

## Overview

Point clouds serve three major roles in the realtime.earth platform:

### 1. Agent Representation (AgentScript)

Point clouds can optionally represent agents in AgentScript simulations:

- **Turtles**: Mobile agents as point samples. The point cloud stores all agent properties (position, heading, color, custom attributes). Higher-dimensional fields can be encoded via spherical harmonics associated with each point/splat.
- **Patches**: Grid cells as point samples at cell centers. The point cloud stores all patch properties (elevation, temperature, density, custom attributes). Higher-dimensional fields can be encoded via spherical harmonics.
- **Links**: Connections between turtles. The point cloud stores all link properties. In the primal representation, links are edges between turtle points. In the dual representation, edges become points and turtles become edges; faces or groups of turtles become points with edges to adjacent faces.

This bridges agent-based modeling with 3D visualization. An AgentScript model running in the browser can export its current state as a point cloud, enabling:

- Real-time 3D visualization of simulations
- Fusion with sensor data (overlay agents on scanned geometry)
- Temporal recording of agent trajectories
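
As a rough illustration of that export step, the sketch below serializes a list of agents as an ASCII PLY vertex list with one custom property. The `Turtle` dataclass and `turtles_to_ply` function are hypothetical stand-ins, not AgentScript APIs; a real export would read these fields from the running model.

```python
from dataclasses import dataclass


@dataclass
class Turtle:
    """Hypothetical snapshot of one agent's state (not an AgentScript class)."""
    x: float
    y: float
    z: float
    heading: float  # radians, stored as a custom PLY property
    r: int          # color channels, 0-255
    g: int
    b: int


def turtles_to_ply(turtles):
    """Serialize agents as an ASCII PLY vertex list."""
    header = [
        "ply",
        "format ascii 1.0",
        f"element vertex {len(turtles)}",
        "property float x",
        "property float y",
        "property float z",
        "property uchar red",
        "property uchar green",
        "property uchar blue",
        "property float heading",  # custom attribute, allowed by the PLY header
        "end_header",
    ]
    rows = [f"{t.x} {t.y} {t.z} {t.r} {t.g} {t.b} {t.heading}" for t in turtles]
    return "\n".join(header + rows) + "\n"
```

Calling `turtles_to_ply` once per simulation tick and writing each result to its own file would give a simple temporal recording, at the cost of one file per frame.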

### 2. World Geometry

Point clouds represent the physical world as collections of 3D samples. This is the accumulated structure of the scene, built from sensors, scans, or reconstruction:

- **Sources**: LiDAR, photogrammetry, multi-view stereo, depth sensors
- **Attributes**: Position (x, y, z), color, normals, intensity
- **Scale**: From thousands (single frame) to billions (survey-scale)

A Gaussian splat scene is a point cloud with richer per-point data (a covariance matrix for each point's shape, opacity, and spherical harmonics for view-dependent color).

### 3. Camera as Sample Point

A camera position is itself a point sample in the plenoptic field, where light flows in both directions:

**Depth Projection (Outward)**

Every camera pixel can project to a 3D point when depth is known. This is the UVW mapping:

```
Image pixel (u, v) + depth (w) → World point (x, y, z)
```

- **u, v**: Pixel coordinates in image space
- **w**: Depth value (for most depth cameras, distance along the camera's optical axis rather than along the pixel's ray)
- **Projection**: Camera intrinsics (focal length or FOV, principal point) define the ray direction

Depth cameras (Kinect, RealSense) produce depth directly. RGB cameras require depth estimation (stereo, learned, or manual).
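
The UVW mapping can be sketched with a pinhole camera model. This is a minimal example under assumptions: the parameter names `fx`, `fy`, `cx`, `cy` are the usual intrinsics, depth is measured along the optical axis, lens distortion is ignored, and the result is in camera space (a world-space point additionally needs the camera pose).

```python
import math


def focal_from_fov(size_px, fov_deg):
    """Pinhole focal length in pixels from image size and field of view."""
    return (size_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)


def unproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with z-depth to a camera-space point (x, y, z)."""
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)
```

A pixel at the principal point maps to a point straight ahead of the camera; pixels farther from it fan out in proportion to depth. Wide-angle sensors deviate from the pinhole model, so a calibrated distortion model is needed there.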

**Example: StreamTable Kinect Visualization**

A Kinect depth camera projecting a point cloud onto a stream table at Harvard GSD:

[![StreamTable Kinect Visualization](streamtable-viz-screenshot.png)](https://waldo.acequia.io/gsd-vis-2487/video-explanations/streamtable-viz-with-kinect-depth-crf27.mp4)

[streamtable-viz video walkthrough of using pointclouds](https://waldo.acequia.io/gsd-vis-2487/video-explanations/streamtable-viz-with-kinect-depth-crf27.mp4)

**Live app:** [streamtable-beta.html](https://waldo.acequia.io/gsd-vis-2487/streamtable-viz/streamtable-beta.html)

**Data sources:**
- GLB mesh: Photogrammetry from [Polycam](https://poly.cam)
- PLY point cloud: [depth_00050.ply](https://waldo.acequia.io/gsd-vis-2487/streamtable-viz/depth_00050.ply) (Azure Kinect depth frame)

**Light Probes (Inward)**

The camera also receives light from all directions. A panosphere captures this as a spherical function:

- **Panosphere**: Equirectangular image encoding incoming radiance from all directions
- **Light probe**: Camera position as a point sample with spherical harmonics (SH) encoding the environment
- **Compression**: Instead of storing full panosphere pixels, store SH coefficients (9 per color channel for bands 0-2, so 27 floats for RGB)

This parallels Gaussian splats: a world splat radiates light outward (SH encodes view-dependent appearance); a camera sample point receives light inward (panosphere or SH encodes the environment).
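
To make the compression step concrete, here is a sketch that projects a single-channel radiance function onto the 9 real SH basis functions for bands 0-2. The function names, the Fibonacci-lattice sampling, and the sample count are illustrative choices, not part of the platform; a real probe would sample the panosphere image and run once per color channel.

```python
import math


def sh_basis(x, y, z):
    """Real spherical harmonics for bands 0-2 at unit direction (x, y, z)."""
    return [
        0.282095,                        # band 0
        0.488603 * y,                    # band 1
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,                # band 2
        1.092548 * y * z,
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z,
        0.546274 * (x * x - y * y),
    ]


def fibonacci_sphere(n):
    """Yield n roughly uniform unit directions on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        phi = i * golden_angle
        yield (r * math.cos(phi), r * math.sin(phi), z)


def project_to_sh(radiance, n=2000):
    """Estimate 9 SH coefficients of radiance(x, y, z) by numerical integration."""
    coeffs = [0.0] * 9
    weight = 4.0 * math.pi / n  # each sample stands for an equal solid angle
    for d in fibonacci_sphere(n):
        value = radiance(*d)
        for k, basis in enumerate(sh_basis(*d)):
            coeffs[k] += value * basis * weight
    return coeffs
```

Evaluating the environment in a given direction is then a dot product of the stored coefficients with `sh_basis` for that direction. Bands 0-2 only capture low-frequency lighting, so sharp features in the panosphere are blurred away.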

These three roles are complementary: agents move through the world as dynamic point samples; cameras sample the plenoptic field bidirectionally (depth out, environment in); the accumulated points form the world geometry.

### Representation Formats

- **Native interchange**: PLY, PCD, XYZ (position + basic attributes)
- **Streaming/LOD**: Potree, 3D Tiles (hierarchical, massive scale)
- **Cloud-native**: COPC, EPT (HTTP byte-range optimized, like COG for rasters)
- **Temporal**: LAS 1.4 with GPS time, E57, MKV, HDF5 (time-varying data)
- **GPU-optimized**: Binary buffers, Gaussian splats (direct to GPU)

### Interchange: Goes Inta / Goes Outa

Point cloud formats serve as interchange with external platforms:

**Goes Inta (Import):**

| Source | Format | Notes |
|--------|--------|-------|
| Azure Kinect | PLY, MKV depth | Per-frame extraction |
| LiDAR scanners | LAS/LAZ | Industry standard |
| Photogrammetry (Metashape, RealityCapture) | PLY, PCD, LAS | Dense reconstruction |
| iPhone/iPad LiDAR | PLY, USDZ | ARKit export |
| 3DGS training (Nerfstudio, with COLMAP camera poses) | PLY with splat attrs | Gaussian splats |

**Goes Outa (Export):**

| Destination | Format | Notes |
|-------------|--------|-------|
| Archive/interchange | PLY, LAS | Long-term storage |
| Web viewers | Potree, 3D Tiles | Streaming at scale |
| CAD/BIM tools | E57, LAS | Industry interop |
| Machine learning | Binary buffers, HDF5 | Training data |
| Game engines (Unity, Unreal) | PLY, custom binary | Asset import |

---

## Native/Simple Formats

### PLY (Polygon File Format)
- **Extension:** `.ply`
- **Origin:** Stanford Graphics Lab
- **Structure:** Header + data (ASCII or binary)
- **Attributes:** Position, color (RGB/RGBA), normals, custom properties
- **Three.js loader:** `PLYLoader`
- **Pros:** Simple, widely supported, flexible header
- **Cons:** No built-in LOD, can be large for big datasets

```
ply
format ascii 1.0
element vertex 3
property float x
property float y
property float z
property uchar red
property uchar green
property uchar blue
end_header
0.0 0.0 0.0 255 0 0
1.0 0.0 0.0 0 255 0
0.0 1.0 0.0 0 0 255
```
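
A minimal reader for files like the one above might look like the following sketch. It is deliberately narrow: it assumes the ASCII variant, a vertex element that comes first in the body, and scalar properties only (no list properties, no binary encodings). The function name `parse_ascii_ply` is made up for this example.

```python
def parse_ascii_ply(text):
    """Parse the vertex element of an ASCII PLY into a list of dicts."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "ply":
        raise ValueError("not a PLY file")
    names, count, in_vertex, body_start = [], 0, False, None
    for i, line in enumerate(lines):
        tokens = line.split()
        if tokens[:1] == ["format"] and tokens[1:2] != ["ascii"]:
            raise ValueError("only ASCII PLY is handled in this sketch")
        if tokens[:2] == ["element", "vertex"]:
            count, in_vertex = int(tokens[2]), True
        elif tokens[:1] == ["element"]:
            in_vertex = False          # properties now belong to another element
        elif tokens[:1] == ["property"] and in_vertex:
            names.append(tokens[-1])   # property name is the last token
        elif tokens[:1] == ["end_header"]:
            body_start = i + 1
            break
    if body_start is None:
        raise ValueError("missing end_header")
    rows = lines[body_start:body_start + count]
    return [dict(zip(names, map(float, row.split()))) for row in rows]
```

Every value comes back as a float, including the `uchar` color channels; a fuller parser would honor the declared property types.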

### PCD (Point Cloud Data)
- **Extension:** `.pcd`
- **Origin:** Point Cloud Library (PCL)
- **Structure:** Header + data (ASCII, binary, or binary_compressed)
- **Attributes:** Position, color, normals, intensity, custom fields
- **Three.js loader:** `PCDLoader`
- **Pros:** Designed for point clouds, supports compression
- **Cons:** Less common outside robotics/CV community

### XYZ / XYZRGB
- **Extension:** `.xyz`, `.txt`
- **Structure:** Plain text, space-separated values
- **Attributes:** Position only (XYZ) or position + color (XYZRGB)
- **Three.js loader:** Custom (trivial to parse)
- **Pros:** Dead simple, human readable
- **Cons:** No header, no metadata, inefficient

```
0.0 0.0 0.0
1.0 0.0 0.0
0.0 1.0 0.0
```
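
Since the format has no header, parsing reduces to splitting lines. The sketch below accepts either 3 fields (XYZ) or 6 fields (XYZRGB); it assumes whitespace separators and integer 0-255 color values, which is common but not guaranteed since nothing in the file declares it.

```python
def parse_xyz(text):
    """Parse XYZ or XYZRGB text into (x, y, z, color) tuples; color may be None."""
    points = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) not in (3, 6):
            continue  # skip blank or malformed lines
        x, y, z = map(float, fields[:3])
        color = tuple(int(c) for c in fields[3:]) if len(fields) == 6 else None
        points.append((x, y, z, color))
    return points
```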

---

## Optimized/Streaming Formats

### Potree (Potree Octree)
- **Extension:** Directory structure with `metadata.json` + `.bin` chunks
- **Origin:** Potree project (TU Wien)
- **Structure:** Hierarchical octree, chunked binary nodes
- **Attributes:** Position, color, intensity, classification, etc.
- **Three.js integration:** Potree library (wraps three.js)
- **Pros:** Built for massive datasets, progressive LOD, streaming
- **Cons:** Requires preprocessing, complex structure

### 3D Tiles (Cesium)
- **Extension:** `.json` tileset + `.pnts` tiles
- **Origin:** Cesium / OGC standard
- **Structure:** Hierarchical spatial index, binary point tiles
- **Attributes:** Position, color, normals, batch IDs
- **Three.js loader:** `TilesRenderer` (NASA-AMMOS 3DTilesRendererJS) or `Tiles3DLoader` (loaders.gl)
- **Pros:** OGC standard, massive scale, geospatial focus
- **Cons:** Complex, primarily for geo applications

### LAS/LAZ (LiDAR)
- **Extension:** `.las`, `.laz` (compressed)
- **Origin:** ASPRS (American Society for Photogrammetry and Remote Sensing)
- **Structure:** Binary with header, point records
- **Attributes:** Position, intensity, classification, RGB, GPS time, etc.
- **Three.js loader:** `LASLoader` (via loaders.gl or custom)
- **Pros:** Industry standard for LiDAR, rich metadata
- **Cons:** Requires parsing library, no native LOD
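
To show what "binary with header" means in practice, the sketch below reads a few fields from the LAS public header block and converts a stored integer coordinate to real-world units. The byte offsets follow the LAS specification as best recalled here and should be checked against the spec before relying on them; the function names are illustrative, and a production reader should use an existing library.

```python
import struct


def read_las_header(header):
    """Read selected fields from the start of an uncompressed LAS file."""
    if header[:4] != b"LASF":
        raise ValueError("missing LASF signature")
    major, minor = struct.unpack_from("<BB", header, 24)
    (point_offset,) = struct.unpack_from("<I", header, 96)
    point_format, record_length = struct.unpack_from("<BH", header, 104)
    scale = struct.unpack_from("<3d", header, 131)   # x, y, z scale factors
    offset = struct.unpack_from("<3d", header, 155)  # x, y, z offsets
    return {
        "version": (major, minor),
        "point_offset": point_offset,
        "point_format": point_format,
        "record_length": record_length,
        "scale": scale,
        "offset": offset,
    }


def to_world(raw_xyz, scale, offset):
    """LAS stores coordinates as int32; real value = raw * scale + offset."""
    return tuple(r * s + o for r, s, o in zip(raw_xyz, scale, offset))
```

The scale and offset pair is why LAS files keep centimeter or millimeter precision over large survey extents without storing doubles per point.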

---

## GPU-Friendly Formats

### Splat (Gaussian Splatting)
- **Extension:** `.splat`, `.ply` (with splat attributes)
- **Origin:** 3D Gaussian Splatting research
- **Structure:** Points with covariance matrices and spherical harmonics
- **Attributes:** Position, scale, rotation, opacity, SH coefficients
- **Three.js loader:** Various community loaders
- **Pros:** Photorealistic rendering, view-dependent effects
- **Cons:** Larger per-point data, specialized rendering pipeline
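
The relationship between the stored attributes and the covariance matrix can be sketched as Σ = R·S·Sᵀ·Rᵀ, where R comes from the rotation quaternion and S is the diagonal scale matrix. The code below assumes a (w, x, y, z) quaternion order and linear scales; some splat PLY exports store log-scales, in which case they would need to be exponentiated first. Function names are illustrative.

```python
import math


def quat_to_matrix(w, x, y, z):
    """Rotation matrix from a quaternion (normalized here for safety)."""
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def splat_covariance(scale, quat):
    """3x3 covariance R * diag(scale^2) * R^T for one splat."""
    R = quat_to_matrix(*quat)
    s2 = [s * s for s in scale]
    return [
        [sum(R[i][k] * s2[k] * R[j][k] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
```

Storing scale and rotation rather than the covariance itself keeps the matrix symmetric and positive semi-definite by construction.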

### Binary Buffer (Custom)
- **Extension:** `.bin`, `.buffer`
- **Structure:** Raw typed arrays (Float32Array, Uint8Array)
- **Attributes:** Whatever you encode
- **Three.js:** Direct to `BufferGeometry`
- **Pros:** Fastest load, minimal parsing
- **Cons:** No metadata, format must be documented separately
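
One possible layout, shown as a sketch: positions as tightly packed float32 triples and colors as uint8 triples, in two separate buffers that could each back a typed array on the client. The layout and function names are assumptions for illustration; `array` uses the host byte order, so this presumes writer and reader are both little-endian.

```python
from array import array


def pack_points(points):
    """Pack (x, y, z, r, g, b) tuples into float32 and uint8 byte buffers."""
    positions = array("f")  # 4 bytes per component
    colors = array("B")     # 1 byte per channel
    for x, y, z, r, g, b in points:
        positions.extend((x, y, z))
        colors.extend((r, g, b))
    return positions.tobytes(), colors.tobytes()


def unpack_positions(buf):
    """Inverse of the position half of pack_points."""
    values = array("f")
    values.frombytes(buf)
    it = iter(values)
    return list(zip(it, it, it))
```

Because the file carries no header, the point count has to be derived from the byte length (12 bytes per position here) or recorded in a sidecar file.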

---

## Cloud-Native Formats

Cloud-native formats are optimized for HTTP byte-range requests, enabling efficient streaming without downloading entire files. These are the point cloud equivalent of Cloud-Optimized GeoTIFF (COG) for raster data.
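
A byte-range request is an ordinary HTTP GET with a `Range` header. The sketch below builds such a request with the standard library; the helper names and any URL passed in are placeholders, and the server must support range requests (answering with status 206) for this to work.

```python
import urllib.request


def range_request(url, start, length):
    """Build a GET request for `length` bytes beginning at byte `start`."""
    end = start + length - 1  # the Range header's end position is inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})


def fetch_range(url, start, length):
    """Fetch a byte range, failing if the server ignores the Range header."""
    with urllib.request.urlopen(range_request(url, start, length)) as resp:
        if resp.status != 206:
            raise RuntimeError(f"expected 206 Partial Content, got {resp.status}")
        return resp.read()
```

A client typically issues one small request for the file's header and index, then further requests only for the chunks that intersect the current view.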

### COPC (Cloud-Optimized Point Cloud)
- **Extension:** `.copc.laz`
- **Origin:** PDAL/Hobu (2021)
- **Structure:** Single LAZ 1.4 file; a COPC info VLR at file start locates the octree hierarchy (typically stored in an EVLR after the point data)
- **Attributes:** All LAS attributes (position, intensity, classification, RGB, GPS time)
- **Tools:** PDAL, QGIS 3.26+, CloudCompare
- **Spec:** https://copc.io
- **Pros:** Single file, HTTP range requests, standard LAZ compression, wide tool support
- **Cons:** Requires LAZ 1.4 support

```
┌─────────────────────────────────┐
│  LAZ Header + COPC info VLR     │  ← Read first (locates octree hierarchy)
├─────────────────────────────────┤
│  Octree Node 0 (root)           │  ← Coarse LOD
│  Octree Node 1                  │
│  Octree Node 2                  │  ← Fetch specific nodes
│  ...                            │     via byte-range requests
│  Octree Node N (leaves)         │  ← Fine detail
└─────────────────────────────────┘
```
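
Octree nodes in layouts like this are commonly addressed by a (depth, x, y, z) key, where each level halves the cell size. The sketch below computes the key for a point and walks between levels; the function names and the cube parameters are illustrative, and it does not read any actual COPC hierarchy data.

```python
def voxel_key(point, cube_min, cube_size, depth):
    """Key (depth, x, y, z) of the octree cell containing `point`."""
    cells = 1 << depth  # cells per axis at this depth
    key = [depth]
    for p, lo in zip(point, cube_min):
        idx = int((p - lo) / cube_size * cells)
        key.append(min(max(idx, 0), cells - 1))  # clamp points on the boundary
    return tuple(key)


def children(key):
    """The eight child keys one level deeper."""
    d, x, y, z = key
    return [
        (d + 1, 2 * x + i, 2 * y + j, 2 * z + k)
        for i in (0, 1) for j in (0, 1) for k in (0, 1)
    ]


def parent(key):
    """The key one level coarser (the root is its own parent here)."""
    d, x, y, z = key
    return key if d == 0 else (d - 1, x // 2, y // 2, z // 2)
```

A viewer starts at the root key, then descends into only those children whose cells intersect the view frustum and are large enough on screen to need more detail.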

### EPT (Entwine Point Tiles)
- **Extension:** Directory with `ept.json` + binary chunks
- **Origin:** Hobu/PDAL ecosystem
- **Structure:** Octree with each node as separate file
- **Attributes:** Configurable per-dataset
- **Tools:** PDAL, Potree, entwine CLI
- **Pros:** Simple HTTP serving (static files), no range requests needed
- **Cons:** Many small files, requires directory structure

### Cloud-Native Comparison

| Aspect | COG (raster) | COPC (point cloud) | EPT |
|--------|--------------|-------------------|-----|
| Base format | GeoTIFF | LAZ 1.4 | LAZ, raw binary, or Zstandard tiles |
| Index location | IFD at file start | Info VLR at file start | Separate JSON |
| Spatial structure | Tiled pyramids | Octree | Octree |
| Range requests | Yes | Yes | No (separate files) |
| Single file | Yes | Yes | No |
| Compression | DEFLATE/LZW/JPEG | LAZ (lossless) | Optional |

---

## Temporal Formats

Formats that handle time-varying point cloud data (4D: x, y, z, t).

### LAS 1.4 with GPS Time
- **Extension:** `.las`, `.laz`
- **Time field:** GPS Time (per-point timestamp)
- **Resolution:** 64-bit floating-point seconds; microsecond precision in practice with Adjusted Standard GPS Time
- **Use case:** LiDAR surveys, mobile mapping
- **Notes:** Standard LAS attribute, widely supported
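
Per-point timestamps are often stored as Adjusted Standard GPS Time, meaning GPS seconds since 1980-01-06 minus 10^9. The sketch below converts such a value to a calendar time. It assumes the file's global encoding actually indicates this mode (otherwise the field holds GPS week seconds), and the leap-second count for converting to UTC is left as a caller-supplied parameter.

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)
ADJUSTED_OFFSET = 1_000_000_000  # seconds subtracted by the "adjusted" encoding


def adjusted_gps_to_datetime(adjusted_seconds, leap_seconds=0):
    """Convert Adjusted Standard GPS Time to a datetime.

    With leap_seconds=0 the result is on the GPS time scale; pass the
    GPS-UTC offset valid for the capture date to get UTC.
    """
    gps_seconds = adjusted_seconds + ADJUSTED_OFFSET
    return GPS_EPOCH + timedelta(seconds=gps_seconds - leap_seconds)
```

Sorting or binning points by this value is what turns a static LAS file into a time-ordered sequence.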

### E57
- **Extension:** `.e57`
- **Origin:** ASTM International
- **Structure:** XML metadata + binary point data
- **Time support:** Per-scan timestamps, structured scans
- **Use case:** Terrestrial laser scanning, BIM
- **Pros:** Multi-scan support, rich metadata, industry standard
- **Cons:** Complex format, limited streaming support

### ROS Bag
- **Extension:** `.bag`, `.mcap`
- **Origin:** Robot Operating System
- **Structure:** Time-indexed message streams
- **Time support:** Nanosecond timestamps, topic-based organization
- **Use case:** Robotics, autonomous vehicles, sensor fusion
- **Tools:** ROS, rosbag, MCAP libraries
- **Pros:** Multi-sensor sync, arbitrary message types
- **Cons:** ROS ecosystem dependency

### MKV (Azure Kinect)
- **Extension:** `.mkv`
- **Origin:** Matroska container (Azure Kinect SDK)
- **Structure:** Multiplexed streams (color, depth, IR, IMU)
- **Time support:** Per-frame timestamps, synchronized streams
- **Use case:** Depth camera recording, body tracking
- **Extraction:** Azure Kinect SDK, ffmpeg (color only)

### HDF5
- **Extension:** `.h5`, `.hdf5`
- **Origin:** HDF Group
- **Structure:** Hierarchical datasets with arbitrary arrays
- **Time support:** Custom time dimensions, chunked access
- **Use case:** Scientific data, ML training sets, large archives
- **Pros:** Flexible schema, compression, partial reads
- **Cons:** Complex API, not point-cloud-specific

### USD (Universal Scene Description)
- **Extension:** `.usd`, `.usda`, `.usdc`, `.usdz`
- **Origin:** Pixar
- **Structure:** Scene graph with time-sampled attributes
- **Time support:** Native animation timeline, per-attribute keyframes
- **Use case:** VFX, game engines, AR/VR
- **Pros:** Industry standard, Apple ecosystem support
- **Cons:** Complex, primarily for authored content

---

## Format Comparison

| Format | LOD | Streaming | Cloud-Native | Time | Complexity | Three.js Support |
|--------|-----|-----------|--------------|------|------------|------------------|
| PLY | No | No | No | No | Low | Addon loader (`PLYLoader`) |
| PCD | No | No | No | No | Low | Addon loader (`PCDLoader`) |
| XYZ | No | No | No | No | Minimal | Custom (easy) |
| Potree | Yes | Yes | Partial | No | High | Potree library |
| 3D Tiles | Yes | Yes | Yes | No | High | loaders.gl |
| LAS/LAZ | No | Partial | No | GPS | Medium | loaders.gl |
| COPC | Yes | Yes | Yes | GPS | Medium | loaders.gl |
| EPT | Yes | Yes | Yes | No | Medium | Potree |
| Splat | No | No | No | No | Medium | Community |
| Binary | No | Yes | No | Custom | Low | Direct |
| E57 | No | No | No | Yes | High | Custom |
| HDF5 | No | Partial | No | Yes | High | Custom |
| USD | No | Partial | No | Yes | High | USD libraries |

---

## Recommendations

**Small datasets (<1M points):** PLY or PCD, simple and well-supported.

**Large datasets (>10M points):** Potree or 3D Tiles, preprocessing required but enables streaming and LOD.

**Cloud hosting (S3, CDN):** COPC for single-file deployment with HTTP range requests. EPT if you prefer static file serving without range request support.

**LiDAR data:** LAS/LAZ for interchange, COPC for cloud storage, convert to Potree for web display.

**Time-varying data:** LAS 1.4 with GPS time for LiDAR sequences. MKV for depth camera recordings. HDF5 for large scientific archives.

**Multi-sensor fusion:** ROS bag or MCAP for robotics. MKV for Kinect with synchronized RGB/depth/IMU.

**Photorealistic:** Gaussian splats for view-dependent rendering.

**Maximum performance:** Pre-convert to binary buffers matching your BufferGeometry layout.

---

## Azure Kinect Depth Modes

| Mode | H-FOV | V-FOV | Resolution | Range |
|------|-------|-------|------------|-------|
| NFOV unbinned | 75° | 65° | 640×576 | 0.5-3.86m |
| NFOV 2x2 binned | 75° | 65° | 320×288 | 0.5-5.46m |
| WFOV 2x2 binned | 120° | 120° | 512×512 | 0.25-2.88m |
| WFOV unbinned | 120° | 120° | 1024×1024 | 0.25-2.21m |
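
The resolutions in the table translate directly into an upper bound on points per frame. The sketch below computes that bound and the raw buffer size, assuming 15 bytes per point (float32 xyz plus uint8 RGB), which is an illustrative layout rather than anything the sensor emits; real frames contain fewer points because pixels without a valid depth reading are dropped.

```python
DEPTH_MODES = {
    "NFOV unbinned": (640, 576),
    "NFOV 2x2 binned": (320, 288),
    "WFOV 2x2 binned": (512, 512),
    "WFOV unbinned": (1024, 1024),
}


def frame_budget(width, height, bytes_per_point=15):
    """Upper bound on points and raw bytes for one depth frame."""
    points = width * height
    return points, points * bytes_per_point


for mode, (w, h) in DEPTH_MODES.items():
    points, size = frame_budget(w, h)
    print(f"{mode}: up to {points} points, {size / 1e6:.1f} MB per frame")
```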

---

## Notes

- Three.js `PointsMaterial` (used with `Points`) supports point size attenuation (`sizeAttenuation`) and vertex colors (`vertexColors`)
- For millions of points, consider instanced rendering or custom shaders
- WebGPU (three.js r160+) can improve point cloud performance for large scenes; benchmark on target hardware