This document provides users with an understanding of the key components that make up the Naveego Platform.
A source is a repository of data that the user wants to master. Data sources can reside on-premises (on the client network) or in the cloud. Examples include Microsoft SQL Server, MongoDB, Salesforce, Oracle, and many more.
Examples of Data Sources
An Agent is a lightweight software application that facilitates secure communication between the client’s data sources and the Naveego Platform. For on-premises data sources, the Agent must run on the same client network where those data sources reside.
All Naveego Platform instances come with a Cloud Agent out of the box, allowing you to connect to any cloud-hosted data sources without the need to install an agent.
A Plugin is a software component executed by the Agent for a specific type of data source. A plugin knows how to communicate with the data source and allows for quick and easy integration between the client’s data sources and the Naveego Platform.
When connecting to Microsoft SQL Server, you would select the MSSQL plugin. When connecting to Oracle, you would select the Oracle plugin.
A connection is a combination of a client data source, a plugin for that data source, and an Agent. The user configures the connection to provide all the information the Naveego Platform needs to establish a link to the data source and then ingest data from it or send data to it.
An example of 3 connections to a single Agent
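To make the relationship concrete, the following is a minimal sketch of how three connections might share a single Agent. The class names, field names, and values (`Agent`, `Connection`, `office-agent`, and so on) are hypothetical and chosen only for illustration; they are not the Naveego Platform's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical stand-in for an Agent installed on a client network."""
    name: str
    location: str  # "on-premises" or "cloud"


@dataclass(frozen=True)
class Connection:
    """A connection ties together a data source, a plugin, and an Agent."""
    name: str
    source: str   # the data source to be mastered
    plugin: str   # the plugin that knows how to talk to that source type
    agent: Agent  # the Agent that executes the plugin


# One Agent on the client network (illustrative name).
office_agent = Agent(name="office-agent", location="on-premises")

# Three connections, each pairing a source with its plugin, all using one Agent.
connections = [
    Connection("crm-db", source="Microsoft SQL Server", plugin="MSSQL", agent=office_agent),
    Connection("erp-db", source="Oracle", plugin="Oracle", agent=office_agent),
    Connection("orders-db", source="Microsoft SQL Server", plugin="MSSQL", agent=office_agent),
]

# All three connections resolve to the same Agent.
agents_in_use = {c.agent.name for c in connections}
```

In this sketch the plugin is chosen per data source type, while the Agent is chosen by where the source lives, which is why several connections can reuse one Agent.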
A shape is a logical data entity as defined by the business. Shapes are data-source-agnostic entities in which data elements from any number of data sources are combined and organized.
Every shape has Match and Merge rules configured that will allow the system to group duplicate records and merge them together, taking the best available data from each source system.
The result of this process is a list of Golden Records that are a combination of records from multiple data sources, with only the best, most trusted data promoted to the ‘known truth’ that is the Golden Record.
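The match-and-merge idea can be sketched in a few lines. This is only an illustration under stated assumptions: matching is done on a normalized email address, and merging takes each field from the most trusted source that has a value. The `SOURCE_PRIORITY` ranking, the field names, and the sample records are all hypothetical; the rules actually available in the Naveego Platform are configured per shape and are not described here.

```python
from collections import defaultdict

# Assumed trust ranking for illustration: lower number = more trusted source.
SOURCE_PRIORITY = {"salesforce": 0, "mssql": 1}

FIELDS = ("email", "name", "phone")

records = [
    {"source": "mssql", "email": "Ann@Example.com", "name": "Ann Lee", "phone": "555-0100"},
    {"source": "salesforce", "email": "ann@example.com", "name": "Ann B. Lee", "phone": None},
    {"source": "mssql", "email": "bob@example.com", "name": "Bob Ray", "phone": "555-0111"},
]


def match_key(record):
    """Match rule (assumed): records with the same normalized email are duplicates."""
    return record["email"].strip().lower()


def merge(group):
    """Merge rule (assumed): per field, take the first non-empty value by source trust."""
    ranked = sorted(group, key=lambda r: SOURCE_PRIORITY[r["source"]])
    return {
        field: next((r[field] for r in ranked if r[field] is not None), None)
        for field in FIELDS
    }


def build_golden_records(records):
    """Group duplicates by match key, then merge each group into one Golden Record."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [merge(group) for group in groups.values()]


golden_records = build_golden_records(records)
```

Here the two records for Ann collapse into one Golden Record that takes the name from the more trusted source and falls back to the other source for the phone number, which the trusted source lacks.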
Jobs trigger the movement of data either into (Read) or out of (Writeback, Replication) the Naveego Platform. Jobs can be run on demand, on a schedule, or in real time for platforms that support it.
There are three types of jobs:
Read – A read job is triggered to bring data into the Naveego Platform from a connected Data Source. The data is then processed into a Shape where that data will go through the matching and merging processes.
Writeback – A writeback job is triggered to send Golden Record data out of the Naveego Platform and back to a connected data source. It updates records in that data source with values that may have changed during the matching and merging processes that created the Golden Records. Writeback jobs are often used to bring all connected data sources into alignment with the trusted ‘known truth’ that results from the Naveego process.
Replication – Rather than updating an existing data source, a replication job maintains a separate schema within a connected data source that always represents the most up-to-date version of each Golden Record for a particular shape. Replication jobs are often used to send Golden Record data from the Naveego Platform to data analytics and warehousing tools.
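The difference between the three job types can be sketched with in-memory data. This is a simplified illustration, not the platform's implementation: the `JobType` enum, the `run_job` function, the `golden_schema` key, and the assumption that Golden Records and source rows share an `id` are all hypothetical.

```python
from enum import Enum


class JobType(Enum):
    READ = "read"
    WRITEBACK = "writeback"
    REPLICATION = "replication"


def run_job(job_type, data_source, golden_records):
    """Move data in the direction implied by the job type (illustrative only)."""
    if job_type is JobType.READ:
        # Into the platform: copy the source's records for later matching and merging.
        return [dict(row) for row in data_source["records"]]

    if job_type is JobType.WRITEBACK:
        # Out of the platform: update existing source rows with Golden Record values.
        golden_by_id = {g["id"]: g for g in golden_records}
        for row in data_source["records"]:
            golden = golden_by_id.get(row["id"])
            if golden is not None:
                row.update(golden)
        return data_source["records"]

    if job_type is JobType.REPLICATION:
        # Out of the platform: leave existing rows alone and refresh a separate schema.
        data_source["golden_schema"] = [dict(g) for g in golden_records]
        return data_source["golden_schema"]

    raise ValueError(f"unknown job type: {job_type}")


crm = {"records": [{"id": 1, "name": "Ann Lee"}, {"id": 2, "name": "Bob Ray"}]}
golden = [{"id": 1, "name": "Ann B. Lee"}]

ingested = run_job(JobType.READ, crm, golden)
run_job(JobType.WRITEBACK, crm, golden)
run_job(JobType.REPLICATION, crm, golden)
```

After these calls, the writeback has changed the matching row in `crm["records"]`, while the replication has written a separate `golden_schema` copy without touching the rows that had no Golden Record.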
A high-level diagram showing what an implementation of the Naveego Platform might look like