
ZoeDepth

ZoeDepth is an open-source monocular depth estimation model developed by researchers at Intel Labs and KAUST. It transforms a single 2D image into a detailed depth map, effectively recovering the 3D structure of the scene. Unlike earlier models that predicted either relative or metric depth alone, ZoeDepth combines the two: a MiDaS-style backbone pre-trained for relative depth is extended with lightweight metric-bins heads fine-tuned on metric datasets, allowing it to produce accurate, metric-aware depth predictions without requiring camera intrinsics. It is designed for robustness across diverse scenes, from indoor environments to outdoor landscapes. The model is particularly valuable in robotics, augmented reality, 3D reconstruction, and computational photography, where understanding scene geometry from a single viewpoint is critical. Its release as pre-trained models on GitHub makes state-of-the-art depth estimation accessible to developers, researchers, and hobbyists.


📊 At a Glance

Pricing
Free / Open Source (MIT License)
Reviews
No reviews
Categories
Data & Analytics
Computer Vision

Key Features

Metric-Aware Depth Estimation

Produces depth maps in absolute, real-world metric units (e.g., meters) from a single RGB image, without requiring camera calibration parameters as input.
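
A minimal sketch of this, assuming the PyTorch Hub entry point documented in the repository's README and a hypothetical local image `room.jpg`:

```python
import torch
from PIL import Image

# Load the NYU-tuned indoor variant via torch.hub; weights download on first use.
model = torch.hub.load("isl-org/ZoeDepth", "ZoeD_N", pretrained=True)
model = model.to("cuda" if torch.cuda.is_available() else "cpu").eval()

img = Image.open("room.jpg").convert("RGB")  # hypothetical input image
depth = model.infer_pil(img)                 # NumPy array; values are meters

print(depth.shape, float(depth.min()), float(depth.max()))
```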

Multi-Head Transformer Architecture

Builds on a transformer-based MiDaS backbone pre-trained for relative depth and attaches lightweight metric-bins heads fine-tuned on metric datasets; the combined ZoeD_NK model routes each image to the appropriate head automatically.

Zero-Shot Cross-Dataset Generalization

The pre-trained models demonstrate strong performance on datasets they were not explicitly trained on, handling varied indoor and outdoor scenes robustly.

High-Resolution Output

Generates dense, detailed depth maps that preserve fine structures and object boundaries at the native resolution of the input image.

Comprehensive Training Framework

The open-source release includes not just inference code but full training scripts, loss functions, and data loaders.

Pricing

Open Source / Research

$0
  • ✓Full access to all pre-trained model weights (ZoeD_N, ZoeD_K, ZoeD_NK).
  • ✓Complete source code for inference, training, and evaluation.
  • ✓Freedom to use, modify, and distribute for any purpose, including commercial, under the MIT License.
  • ✓No user limits, seat restrictions, or API quotas.
  • ✓Community support via GitHub Issues.

Use Cases

1

Augmented Reality (AR) Scene Understanding

AR developers use ZoeDepth to understand the 3D geometry of a user's environment from their smartphone camera feed. The metric depth map allows virtual objects to be placed with correct scale and occlusion, making them appear anchored in the real world. This enhances user immersion in applications ranging from gaming to furniture visualization.

2

Robotics Navigation and Manipulation

Roboticists integrate ZoeDepth into perception systems for drones or mobile robots. By estimating the distance to obstacles, floors, and objects from a single camera, robots can perform tasks like obstacle avoidance, path planning, and bin picking without needing expensive LiDAR or stereo cameras. This reduces system cost and complexity while maintaining robust spatial awareness.
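
As an illustration, a rough stop-distance check over a saved ZoeDepth map; the `depth.npy` path, region-of-interest size, and 0.5 m threshold are all assumptions for this sketch:

```python
import numpy as np

depth = np.load("depth.npy")  # metric depth map saved earlier (hypothetical path)

def min_forward_distance(depth: np.ndarray, roi_frac: float = 0.3) -> float:
    """Closest metric distance inside a central region of interest."""
    h, w = depth.shape
    dh, dw = int(h * roi_frac / 2), int(w * roi_frac / 2)
    roi = depth[h // 2 - dh : h // 2 + dh, w // 2 - dw : w // 2 + dw]
    # A low percentile resists noisy pixels better than the raw minimum.
    return float(np.percentile(roi, 2))

STOP_DISTANCE_M = 0.5  # assumed safety margin
if min_forward_distance(depth) < STOP_DISTANCE_M:
    print("Obstacle ahead: stop or replan.")
```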

3

Computational Photography & Smartphone Cameras

Camera software engineers employ ZoeDepth to enable portrait-mode effects and advanced image editing. By creating an accurate depth map of a scene, the software can selectively blur backgrounds (bokeh), apply layer-based filters, or simulate refocusing after a photo is taken. This brings professional-grade photographic effects to consumer devices.
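
A hedged sketch of a depth-driven bokeh pass with OpenCV; the file names and the 2 m focus plane are assumptions, and the depth map must match the photo's resolution:

```python
import cv2
import numpy as np

img = cv2.imread("portrait.jpg")       # hypothetical input photo
depth = np.load("portrait_depth.npy")  # matching ZoeDepth metric map

FOCUS_DEPTH_M = 2.0  # assumed focus plane: anything farther is background
mask = (depth > FOCUS_DEPTH_M).astype(np.float32)
mask = cv2.GaussianBlur(mask, (31, 31), 0)[..., None]  # feathered matte

blurred = cv2.GaussianBlur(img, (51, 51), 0)
bokeh = (img * (1.0 - mask) + blurred * mask).astype(np.uint8)
cv2.imwrite("portrait_bokeh.jpg", bokeh)
```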

4

3D Content Creation and Reconstruction

Digital artists and game developers use ZoeDepth to quickly generate 3D proxies or rough geometry from reference images or concept art. The depth map can be converted into a point cloud or mesh, serving as a foundational scaffold for detailed 3D modeling, speeding up the asset creation pipeline for games, films, and VR experiences.
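
A minimal back-projection sketch follows; because ZoeDepth does not recover camera intrinsics, the pinhole parameters below are rough placeholders you would replace with real calibration:

```python
import numpy as np

depth = np.load("scene_depth.npy")  # metric depth map (hypothetical path)
h, w = depth.shape

# Assumed intrinsics: focal length ~ image width, principal point at center.
fx = fy = float(w)
cx, cy = w / 2.0, h / 2.0

u, v = np.meshgrid(np.arange(w), np.arange(h))
x = (u - cx) * depth / fx
y = (v - cy) * depth / fy
points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)  # (H*W, 3)

# .xyz is a plain-text point format most 3D tools can import.
np.savetxt("scene_points.xyz", points, fmt="%.4f")
```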

5

Autonomous Driving Perception

While primary systems rely on LiDAR and radar, ZoeDepth can serve as a complementary, cost-effective visual perception module. It helps in estimating the distance to vehicles, pedestrians, and curbs from monocular dashcam footage, providing valuable redundancy and context for tasks like free-space detection and object tracking, especially in resource-constrained scenarios.
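
For illustration, ranging a single detection against the depth map; the bounding box coordinates stand in for output from a separate object detector:

```python
import numpy as np

depth = np.load("dashcam_depth.npy")  # metric depth for one frame (hypothetical)

# Assumed bounding box (x1, y1, x2, y2) from an upstream detector.
x1, y1, x2, y2 = 410, 220, 530, 300
box = depth[y1:y2, x1:x2]

# The median resists pixels that bleed past the object onto road or sky.
print(f"Estimated range to object: {np.median(box):.1f} m")
```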

How to Use

  1. Step 1: Set up your Python environment (Python 3.8+ recommended) and install the core dependencies, primarily PyTorch; the repository's README and environment file list the remaining requirements.
  2. Step 2: Clone the official GitHub repository (`git clone https://github.com/isl-org/ZoeDepth`) to access demo scripts, utilities, and the latest codebase.
  3. Step 3: Choose and load a pre-trained ZoeDepth model variant (e.g., `ZoeD_N` for NYU Depth V2, `ZoeD_K` for KITTI, or `ZoeD_NK` for a combined model) using the provided Python API.
  4. Step 4: Prepare your input image as a PIL Image or NumPy array, ensuring it is in RGB format and optionally pre-processed (e.g., resized) as required by the specific model variant.
  5. Step 5: Pass the image through the model's `infer` (tensor) or `infer_pil` (PIL) method to generate a depth map. The output is a depth array where each pixel value represents the estimated distance from the camera; steps 3-6 are combined in the sketch after this list.
  6. Step 6: Visualize or post-process the raw depth output. The repository includes utilities for creating color-mapped depth images for interpretation or saving the depth data as a file (e.g., .npy or .png).
  7. Step 7: Integrate the depth estimation pipeline into your application. This could involve processing video frames, combining depth with other CV tasks, or using the depth data for 3D scene understanding.
  8. Step 8: For advanced use, explore fine-tuning the model on a custom dataset using the training scripts provided in the repository, adjusting hyperparameters for your specific domain.
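
Putting steps 3-6 together, a minimal sketch assuming the torch.hub entry point from the repository's README and a hypothetical `input.jpg` (torch.hub fetches code and weights, so a local clone is only needed for training or the bundled utilities):

```python
import matplotlib.cm as cm
import numpy as np
import torch
from PIL import Image

# Step 3: load the combined indoor/outdoor variant.
model = torch.hub.load("isl-org/ZoeDepth", "ZoeD_NK", pretrained=True)
model = model.to("cuda" if torch.cuda.is_available() else "cpu").eval()

# Steps 4-5: run inference on an RGB image.
img = Image.open("input.jpg").convert("RGB")  # hypothetical input
depth = model.infer_pil(img)                  # (H, W) float array, meters

# Step 6: keep the raw values, plus a color-mapped PNG for quick inspection.
np.save("depth.npy", depth)
norm = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
vis = (cm.magma(norm)[..., :3] * 255).astype(np.uint8)
Image.fromarray(vis).save("depth_vis.png")
```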

Reviews & Ratings

No reviews yet


Alternatives


15Five

15Five operates in the people analytics and employee experience space, where platforms aggregate HR and feedback data to give organizations insight into their workforce. These tools typically support engagement surveys, performance or goal tracking, and dashboards that help leaders interpret trends. They are intended to augment HR and management decisions, not to replace professional judgment or context. For specific information about 15Five's metrics, integrations, and privacy safeguards, you should refer to the vendor resources published at https://www.15five.com.

Data & Analytics
Data Analysis Tools

20-20 Technologies

20-20 Technologies is a comprehensive interior design and space planning software platform primarily serving kitchen and bath designers, furniture retailers, and interior design professionals. The company provides specialized tools for creating detailed 3D visualizations, generating accurate quotes, managing projects, and streamlining the entire design-to-sales workflow. Their software enables designers to create photorealistic renderings, produce precise floor plans, and automatically generate material lists and pricing. The platform integrates with manufacturer catalogs, allowing users to access up-to-date product information and specifications. 20-20 Technologies focuses on bridging the gap between design creativity and practical business needs, helping professionals present compelling visual proposals while maintaining accurate costing and project management. The software is particularly strong in the kitchen and bath industry, where precision measurements and material specifications are critical. Users range from independent designers to large retail chains and manufacturing companies seeking to improve their design presentation capabilities and sales processes.

Data & Analytics
Computer Vision
Paid

3D Generative Adversarial Network

3D Generative Adversarial Network (3D-GAN) is a pioneering research project and framework for generating three-dimensional objects using Generative Adversarial Networks. Developed primarily in academia, it represents a significant advancement in unsupervised learning for 3D data synthesis. The tool learns to create volumetric 3D models from 2D image datasets, enabling the generation of novel, realistic 3D shapes such as furniture, vehicles, and basic structures without explicit 3D supervision. It is used by researchers, computer vision scientists, and developers exploring 3D content creation, synthetic data generation for robotics and autonomous systems, and advancements in geometric deep learning. The project demonstrates how adversarial training can be applied to 3D convolutional networks, producing high-quality voxel-based outputs. It serves as a foundational reference implementation for subsequent work in 3D generative AI, often cited in papers exploring 3D shape completion, single-view reconstruction, and neural scene representation. While not a commercial product with a polished UI, it provides code and models for the research community to build upon.

Data & Analytics
Computer Vision
Free / Open Source