Cerebrus
Litmus7’s Metadata Driven Data Quality Engine
Data is often called the new oil: a key driver of innovation and competitiveness, and the backbone of sustainable growth. Most organizations are moving towards a data-driven approach, so the quality of the data they use is of utmost importance. Because huge volumes of data are collected from a wide variety of sources, there is a high chance that incorrect or incomplete data is ingested through manual or machine errors. Poor data quality ultimately hurts sales revenue, customer relationships, and brand reputation.
The role of Cerebrus
Cerebrus is a metadata-driven data quality engine developed by Litmus7 to tackle the problem of inaccurate and incomplete data. By integrating Cerebrus into their data management processes, users can apply data quality checks as data moves from any source to its destination, giving the business faster, more reliable, and more consistent data.
Key Features
User-Friendly Metadata Registry UI
Register column descriptions for quick reference
Minimal Manual Effort
Built in and ready to use with minimal user input. Non-technical users can define data assertions
Column and Row level detailed checks
Column-level checks such as data type, size, values, range, pattern, and action. Row-level checks evaluate each row against the set conditions
Platform Agnostic DQ engines
Python- and Spark-based engines to suit your data engineering (DE) platform
Customizable DQ functions
Leverages custom defined DQ functions for detailed data quality checks. Automatically converts rules into functions
Proactive Actions and Thresholds
Control actions and thresholds based on DQ events at record/dataset level
Smart Audit Log
Record-level audit log that captures, at runtime, each failed DQ rule along with its error message. Graphical representation of data quality output
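To make the column-level checks and thresholds listed above more concrete, here is a minimal Python sketch of how a metadata registry could drive validation. It is not Cerebrus code: the `COLUMN_RULES` layout, the function names `check_record` and `dataset_action`, the example columns, and the 10% threshold are all assumptions made for illustration.

```python
import re

# Hypothetical metadata: one entry per column, mirroring the check types
# named in the feature list (data type, size, values, range, pattern).
COLUMN_RULES = {
    "order_id": {"type": int, "range": (1, 999999)},
    "country": {"type": str, "size": 2, "values": {"US", "IN", "GB"}},
    "email": {"type": str, "pattern": r"[^@\s]+@[^@\s]+\.[^@\s]+"},
}


def check_record(record, rules=COLUMN_RULES):
    """Return a list of (column, failed_check) pairs for one record."""
    failures = []
    for column, rule in rules.items():
        value = record.get(column)
        if not isinstance(value, rule["type"]):
            failures.append((column, "type"))
            continue  # the remaining checks assume the expected type
        if "size" in rule and len(value) > rule["size"]:
            failures.append((column, "size"))
        if "values" in rule and value not in rule["values"]:
            failures.append((column, "values"))
        if "range" in rule:
            low, high = rule["range"]
            if not low <= value <= high:
                failures.append((column, "range"))
        if "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            failures.append((column, "pattern"))
    return failures


def dataset_action(records, max_failure_ratio=0.1):
    """Return 'reject' if the share of failing records exceeds the threshold."""
    if not records:
        return "accept"
    failed = sum(1 for record in records if check_record(record))
    return "reject" if failed / len(records) > max_failure_ratio else "accept"
```

Because the rules live in data rather than in code, a business user could in principle change a range or an allowed-values list without touching the engine, which is the appeal of a metadata-driven design.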
Advantages of Cerebrus
Cerebrus provides a user interface that is simple to use, even for non-technical users. Business users can therefore take more control over defining data quality rules without getting involved in the technical details of how validation runs in the background. The user can upload a sample data file, from which Cerebrus infers details such as column names and data types. The user can then define further validation rules such as uniqueness, applicable values, valid range, expected pattern, and field length. This validation metadata can be persisted to a database and reused to validate the actual source data as many times as needed.
When the data quality engine runs over the source data using the registered metadata, valid and invalid records are separated into different files or tables. In addition, an audit file is generated that captures detailed information about validation failures. This audit file is used to provide insights to the user, such as the most successful column, the most frequently failing column, and the most frequently failing validation constraint.
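The workflow described above (infer metadata from a sample, validate the source data, split valid from invalid records, and summarise the audit trail) can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the Cerebrus implementation: the function names `infer_metadata`, `run_quality_checks`, and `summarise`, the metadata keys, and the audit entry layout are all hypothetical, and only type and uniqueness checks are shown.

```python
from collections import Counter


def infer_metadata(sample_rows):
    """Infer column names and Python types from the first sample row."""
    first = sample_rows[0]
    return {column: {"type": type(value)} for column, value in first.items()}


def run_quality_checks(rows, metadata):
    """Split rows into valid and invalid lists and build an audit trail."""
    valid, invalid, audit = [], [], []
    # Track values already seen for every column flagged as unique.
    seen = {col: set() for col, rule in metadata.items() if rule.get("unique")}
    for index, row in enumerate(rows):
        errors = []
        for column, rule in metadata.items():
            value = row.get(column)
            if not isinstance(value, rule["type"]):
                errors.append((column, "type"))
            elif column in seen:
                if value in seen[column]:
                    errors.append((column, "unique"))
                seen[column].add(value)
        (invalid if errors else valid).append(row)
        for column, constraint in errors:
            audit.append({"row": index, "column": column, "constraint": constraint})
    return valid, invalid, audit


def summarise(audit):
    """Report the most frequently failing column and constraint."""
    columns = Counter(entry["column"] for entry in audit)
    constraints = Counter(entry["constraint"] for entry in audit)
    return {
        "most_failing_column": columns.most_common(1)[0][0] if columns else None,
        "most_failing_constraint": constraints.most_common(1)[0][0] if constraints else None,
    }
```

A typical use would be to call `infer_metadata` once on a sample file, add user-defined rules such as `metadata["sku"]["unique"] = True` (the `sku` column is a made-up example), store the result, and then pass it to `run_quality_checks` on each new batch of source data.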
Key Benefits
Improved accuracy and reliability of data
Empowering business users and data owners to define data assertions
Enhanced data stewardship and power to business
Time and cost savings
Automated enforcement of compliance and governance