Data integration and integrity

Big Data, Cloud, and IoT Integration

With the explosion of next-generation data types, traditional processes are no longer good enough. You need a progressive data management platform with embedded best practices and data integration tools -- delivered on premises, in the cloud, or through a hybrid of both -- to deliver actionable, relevant data for consumption across the enterprise.

Easier, Intelligent Hadoop™ Integration

  • One platform built for Big Data, data preparation & native engagement of Hadoop
  • Visual graphical tools to create, collaborate & control big data for faster time to market
  • Built-in progressive data ingestion - structured, unstructured & social
  • Built-in best practices for data consistency between operational applications
  • Right-time data movement for operating at multiple speeds
  • Thousands of deployments for data management, movement, integration & BI

Key Features

Real-time transactional, batch, or changed-data (delta) streams for bi-directional data processing

Streaming and unstructured data support, including Flume, Spark, and Hadoop

  • Data in Motion: Message queuing using Kafka & data wrangling
  • Data at rest (data in database)
  • Sqoop for Hadoop
    • Brings in metadata from database and creates it on Hadoop
    • Supports changed data capture
    • Supports all of the file types Hadoop supports
    • Pipeline (uses Spark)
    • Supports in-pipeline transformation and data quality
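As a rough illustration of what changed data capture means in the list above, the following Python sketch derives a delta by comparing two snapshots of a table. It is not the product's implementation: production change capture typically reads database transaction logs rather than diffing snapshots, and the function name compute_delta, the key column "id", and the sample rows are all hypothetical.

```python
from typing import Dict, List

Row = Dict[str, object]


def compute_delta(previous: List[Row], current: List[Row], key: str) -> Dict[str, List[Row]]:
    """Compare two snapshots keyed on `key` and classify the differences."""
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}

    # Rows whose key appears only in the new snapshot.
    inserts = [row for k, row in curr_by_key.items() if k not in prev_by_key]
    # Rows present in both snapshots whose contents differ.
    updates = [
        row for k, row in curr_by_key.items()
        if k in prev_by_key and prev_by_key[k] != row
    ]
    # Rows whose key appears only in the old snapshot.
    deletes = [row for k, row in prev_by_key.items() if k not in curr_by_key]
    return {"insert": inserts, "update": updates, "delete": deletes}


# Hypothetical sample data, for illustration only.
previous = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Grace", "city": "New York"},
    {"id": 3, "name": "Alan", "city": "Manchester"},
]
current = [
    {"id": 1, "name": "Ada", "city": "London"},
    {"id": 2, "name": "Grace", "city": "Arlington"},
    {"id": 4, "name": "Edsger", "city": "Austin"},
]

delta = compute_delta(previous, current, key="id")
```

Only the three changed rows end up in delta, so a downstream target needs to apply just those rows instead of reloading the whole table.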

Allows ingestion of data at rest from relational sources, other Hadoop sources, and NoSQL sources, as well as streaming data and data wrangling with natural language processing and integrated data quality

  • Integrative Data Quality
    • Cleansing
    • Enriching
    • Standardizing
    • Match
    • Merge
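The data quality steps listed above can be pictured with a small Python sketch that cleanses, standardizes, matches, and merges customer records (enrichment is left out because it depends on an external reference source). This is an assumption-laden illustration rather than the product's own logic: the field names, the rule that records sharing an email address are the same entity, and the sample records are all hypothetical.

```python
import re
from typing import Dict, List

Record = Dict[str, str]


def cleanse(record: Record) -> Record:
    """Collapse runs of whitespace and trim both ends of every field."""
    return {k: re.sub(r"\s+", " ", v).strip() for k, v in record.items()}


def standardize(record: Record) -> Record:
    """Bring name and email into one canonical form."""
    out = dict(record)
    out["name"] = out["name"].title()
    out["email"] = out["email"].lower()
    return out


def match_key(record: Record) -> str:
    """Hypothetical match rule: the same email means the same entity."""
    return record["email"]


def merge(records: List[Record]) -> List[Record]:
    """Keep one surviving record per match key, filling its blank fields."""
    merged: Dict[str, Record] = {}
    for rec in records:
        clean = standardize(cleanse(rec))
        key = match_key(clean)
        if key not in merged:
            merged[key] = clean
            continue
        # Fill fields that are empty in the surviving record.
        for field, value in clean.items():
            if not merged[key].get(field):
                merged[key][field] = value
    return list(merged.values())


# Hypothetical sample data, for illustration only.
raw = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com", "phone": ""},
    {"name": "Ada Lovelace", "email": "ada@example.com ", "phone": "555-0100"},
    {"name": "grace hopper", "email": "grace@example.com", "phone": "555-0199"},
]

golden = merge(raw)
```

The two Ada Lovelace rows collapse into one record that carries the phone number from the second row, while the Grace Hopper row passes through with only its casing standardized.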

Full transformation support across multiple data sources and targets enables rapid deployment and parallel teamwork

Provides improved human interactions to help data stewards drive the integration process

Enables full end-to-end data integration applications

Visual representation of transformation from sources such as Hadoop, Scala, and Spark with pre-defined popular data quality functions coded for faster time to market

  • Visually build data transforms from data on Hadoop
  • Generates code in MapReduce and Scala (can run in Spark)
  • Fully supports SQL aggregates and scalar functions
  • Popular data quality functions (30) pre-coded into BDI
  • Transform targets can be Hive tables, relational database tables, or IDS on-ramp formats
  • Spark Support
    • Full support of Java, Scala, Python
    • Creates build scripts
    • Manages all dependencies
    • Allows you to deploy, run, and debug on any node
    • Predictive Models, GraphX, Match and Merge, streaming, SQL
  • Supports an array of inbound and outbound protocols and is available to any consuming application, system, or analytics product
  • Batch, delta, and real-time transaction processing with full streaming support
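For readers less familiar with the distinction between SQL aggregates and scalar functions mentioned in the list above, here is a minimal Python sketch. It uses the standard library's sqlite3 module purely as a stand-in SQL engine, not the platform's transformation layer, and the orders table and its rows are hypothetical. UPPER is a scalar function applied to each row, while COUNT and SUM are aggregates computed per group.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 10.0), ("East", 15.5), ("west", 7.25)],  # hypothetical rows
)

# UPPER() normalizes each row's region (scalar function);
# COUNT() and SUM() then summarize each normalized group (aggregates).
rows = conn.execute(
    """
    SELECT UPPER(region) AS region_name,
           COUNT(*)      AS order_count,
           SUM(amount)   AS total_amount
    FROM orders
    GROUP BY UPPER(region)
    ORDER BY region_name
    """
).fetchall()
conn.close()
```

Applying the scalar function before grouping is what lets "east" and "East" land in the same group, a small instance of standardizing data inside a transformation.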

Pre-packaged, out-of-the-box modeling, integration, and remediation services with visualization ease design across the organization