lakeFS is a data version control platform that manages the data lifecycle, provenance, and unified access for AI and data teams.

lakeFS is a data version control system that brings Git-like capabilities to data lakes and object storage. It enables data teams to manage data as code, providing features such as branching, merging, and reverting for data. This allows for experimentation, reproducibility, and data quality enforcement. lakeFS supports various storage solutions like Amazon S3, Google Cloud Storage, and Azure Blob Storage. It integrates with compute engines such as Spark, Trino, and Databricks, and is format-agnostic, working with Parquet, CSV, Avro, and more. lakeFS is designed for data engineers, data scientists, and MLOps practitioners who need to manage large datasets, ensure data quality, and streamline data workflows for AI and machine learning projects.
lakeFS specializes in version control for data lakes, branching and merging data, reproducible data pipelines, data quality enforcement, collaboration on data projects, and data lineage tracking.
Branching: Creates isolated, zero-copy branches of the data for experimentation and development without affecting production data.
Versioning: Tracks changes to data over time through commits, so users can revert to a previous version if needed.
Merging: Combines changes from different data branches into a single, consistent dataset.
Lineage: Tracks the origin and transformation of data, providing visibility into the entire data lifecycle.
Access control: Manages user permissions and access to data resources.
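The branching, versioning, and merging behaviour described above can be illustrated with a small in-memory model. This is a conceptual sketch only: `ToyRepo` and its methods are hypothetical names invented for this example and are not part of the lakeFS API. The merge rule is a simplified three-way comparison against the point where the branch was created.

```python
from typing import Dict, List, Optional


class ToyRepo:
    """Hypothetical in-memory model of Git-like data versioning (not the lakeFS API)."""

    def __init__(self) -> None:
        # Each commit is a full snapshot mapping object path -> content.
        self.commits: List[Dict[str, str]] = [{}]
        # A branch is just a pointer to a commit id; creating one copies no data.
        self.heads: Dict[str, int] = {"main": 0}
        # Commit id each branch was created from, used as the merge base.
        self.fork_point: Dict[str, int] = {}

    def branch(self, name: str, source: str = "main") -> None:
        self.heads[name] = self.heads[source]
        self.fork_point[name] = self.heads[source]

    def commit(self, branch: str, changes: Dict[str, str]) -> int:
        snapshot = {**self.commits[self.heads[branch]], **changes}
        self.commits.append(snapshot)
        self.heads[branch] = len(self.commits) - 1
        return self.heads[branch]

    def read(self, branch: str, path: str) -> Optional[str]:
        return self.commits[self.heads[branch]].get(path)

    def merge(self, source: str, dest: str) -> int:
        base = self.commits[self.fork_point[source]]
        src = self.commits[self.heads[source]]
        dst = self.commits[self.heads[dest]]
        merged = dict(dst)
        for path, content in src.items():
            if base.get(path) == content:
                continue  # unchanged on the source branch
            if dst.get(path) not in (base.get(path), content):
                raise ValueError(f"conflict on {path}")
            merged[path] = content
        self.commits.append(merged)
        self.heads[dest] = len(self.commits) - 1
        self.fork_point[source] = self.heads[source]
        return self.heads[dest]

    def rollback(self, branch: str, commit_id: int) -> int:
        # Restore an earlier snapshot by recording it as a new commit.
        self.commits.append(dict(self.commits[commit_id]))
        self.heads[branch] = len(self.commits) - 1
        return self.heads[branch]


repo = ToyRepo()
v1 = repo.commit("main", {"sales.csv": "v1"})
repo.branch("experiment")
repo.commit("experiment", {"sales.csv": "v2", "features.parquet": "new"})
main_during = repo.read("main", "sales.csv")   # main is untouched by the branch
repo.merge("experiment", "main")
main_after = repo.read("main", "sales.csv")    # main now sees the branch's change
repo.rollback("main", v1)
main_rolled = repo.read("main", "sales.csv")   # back to the first version
```

The model stores whole snapshots for readability; a real system tracks metadata rather than duplicating objects, and conflict handling is more involved than the single check shown here.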
Install lakeFS using Docker or Kubernetes.
Configure lakeFS to connect to your object storage (e.g., Amazon S3).
Create a repository in lakeFS to store your data.
Import your existing data into the lakeFS repository.
Create a branch to isolate your changes.
Modify your data and commit the changes to your branch.
Merge your branch back into the main branch to apply your changes.
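The repository, import, branch, commit, and merge steps above can be sketched as a sequence of `lakectl` commands. The helper below only builds and prints the commands; it does not run them. It assumes the `lakectl` CLI is installed and configured, that the default branch is named `main`, and that the subcommands and flags match your lakeFS version. The repository name, bucket, branch, and file paths are placeholder values, and `lakefs_uri` and `workflow_commands` are hypothetical helper names.

```python
import shlex
from typing import List


def lakefs_uri(repo: str, ref: str = "", path: str = "") -> str:
    """Build a lakefs:// URI for a repository, a branch, or an object under a branch."""
    uri = f"lakefs://{repo}"
    if ref:
        uri += f"/{ref}"
        if path:
            uri += f"/{path}"
    return uri


def workflow_commands(
    repo: str,
    storage_namespace: str,
    import_source: str,
    import_prefix: str,
    branch: str,
    local_file: str,
    dest_path: str,
    message: str,
) -> List[List[str]]:
    """Return lakectl commands for: create repo, import, branch, upload, commit, merge."""
    main = lakefs_uri(repo, "main")  # assumption: default branch is "main"
    work = lakefs_uri(repo, branch)
    return [
        ["lakectl", "repo", "create", lakefs_uri(repo), storage_namespace],
        ["lakectl", "import", "--from", import_source,
         "--to", lakefs_uri(repo, "main", import_prefix)],
        ["lakectl", "branch", "create", work, "--source", main],
        ["lakectl", "fs", "upload", lakefs_uri(repo, branch, dest_path),
         "--source", local_file],
        ["lakectl", "commit", work, "-m", message],
        ["lakectl", "merge", work, main],
    ]


# Placeholder values for illustration only.
commands = workflow_commands(
    repo="example-repo",
    storage_namespace="s3://example-bucket/lakefs",
    import_source="s3://example-bucket/raw/",
    import_prefix="raw/",
    branch="fix-nulls",
    local_file="clean.parquet",
    dest_path="tables/clean.parquet",
    message="Replace null values",
)

for cmd in commands:
    print(shlex.join(cmd))
```

Installation and object storage configuration (the first two steps) are left out because they depend on the deployment; check each printed command against the CLI reference for your version before running it.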
Verified feedback from other users.
"Users praise lakeFS for its ability to streamline data science and MLOps workflows, improve robustness and flexibility of data systems, and reduce testing time."