
Pentaho 7.0: a further installment in the quest for excellence

By Dhrumit Shukla
Posted: Oct 12, 2017

Pentaho offers one of the most extensive toolkits on the market: a comprehensive solution that handles the entire data pipeline, from capture and integration through to analytics delivery. With the 7.0 release, the user experience is vastly enhanced, with seamless interfaces between components and a much smoother overall experience.

INTEGRATED DATA WITH PENTAHO 7.0

Businesses and organizations are on the lookout for solutions that shorten data preparation while overcoming the complexity of big data. Pentaho 7.0 provides a truly integrated data preparation and business analytics platform, easing coordination between IT and the business by removing the need to switch contexts and tools, something no other point tool on the market currently delivers.

DATA PREPARATION ANALYTICS

Pentaho is the only platform on the market that brings analytics into data preparation, with no need to switch in and out of tools, so teams can shorten the cycle from data to insight.

  1. Bring analytics into data prep: ETL developers and data preparation staff can inspect analytics in flight, with access to graphs, charts, visualizations, or ad hoc analysis from any step in the data preparation process.
  2. Share analytics during data prep: Publish data sources to the business while data is still being prepared. By sharing data sources immediately, IT can collaborate more closely with the business for a faster, less iterative path to the right analytics.

NEW SPARK CAPABILITIES

Organizations adopt Spark to fuel flexible, fast big data processing and analysis, but with relevant development skills in short supply, it can be a challenge to maximize the value of Spark in production. The upgrades in 7.0 extend the benefits of Spark to a bigger audience, while enabling teams to operationalize Spark as part of wider data-driven business processes.

Extended Spark orchestration: Visually coordinate and schedule Spark applications that use a wider variety of libraries, including Spark SQL, Spark Streaming, and the Spark ML and MLlib machine learning libraries. Pentaho also now supports orchestration of Spark applications written in Python.
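
The same idea can be sketched programmatically, outside PDI's visual tools, using Spark's own SparkLauncher API. The Spark home, application jar, main class, and argument below are placeholders, and in Pentaho this configuration would live in the visual job entry rather than in code:

    import org.apache.spark.launcher.SparkAppHandle;
    import org.apache.spark.launcher.SparkLauncher;

    public class LaunchSparkJob {
        public static void main(String[] args) throws Exception {
            // Placeholder paths, class name, and argument.
            SparkAppHandle handle = new SparkLauncher()
                    .setSparkHome("/opt/spark")
                    .setMaster("yarn")
                    .setAppResource("/jobs/etl-pipeline.jar")
                    .setMainClass("com.example.EtlPipeline")
                    .addAppArgs("2016-11-01")
                    .startApplication();

            // Poll the handle until the application reaches a terminal state.
            while (!handle.getState().isFinal()) {
                Thread.sleep(5000);
            }
            System.out.println("Final state: " + handle.getState());
        }
    }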

SQL on Spark access: Access SQL on Spark as a data source within Pentaho Data Integration, making it easier for ETL developers and data analysts to query Spark data and blend it with other data for preparation and analytics.
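
Under the hood, SQL on Spark is typically exposed through the Spark Thrift Server, which speaks the HiveServer2 JDBC protocol. A minimal Java sketch of such a query, with placeholder host, port, credentials, and table, might look like this:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class SparkSqlQuery {
        public static void main(String[] args) throws Exception {
            // The Spark Thrift Server is HiveServer2-compatible, so the
            // standard Hive JDBC driver is used. Host, port, user, and
            // table name are placeholders.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:hive2://spark-thrift.example.com:10000/default", "etl_user", "");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT region, SUM(amount) AS total FROM sales GROUP BY region")) {
                while (rs.next()) {
                    System.out.println(rs.getString("region") + " -> " + rs.getDouble("total"));
                }
            }
        }
    }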

EXPANDED HADOOP SECURITY

Visual development tools for big data should adhere to the security frameworks that protect major enterprise data sources from intrusion. Pentaho's expanded integration with Hadoop security technologies facilitates big data governance and minimizes risk.

Expanded Kerberos integration: Promote secure data integration for more users with an updated capability that allows multiple Pentaho users to access Kerberos-enabled Cloudera clusters as distinct Hadoop users.
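
For readers curious what Kerberos authentication looks like outside Pentaho, the sketch below uses Hadoop's UserGroupInformation API; the principal, keytab path, and cluster configuration are placeholders (core-site.xml and hdfs-site.xml are assumed to be on the classpath), and PDI manages this per-user setup internally:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.security.UserGroupInformation;

    public class KerberosHdfsAccess {
        public static void main(String[] args) throws Exception {
            // Placeholder principal and keytab path.
            Configuration conf = new Configuration();
            conf.set("hadoop.security.authentication", "kerberos");
            UserGroupInformation.setConfiguration(conf);
            UserGroupInformation.loginUserFromKeytab(
                    "etl_user@EXAMPLE.COM", "/etc/security/keytabs/etl_user.keytab");

            // Once authenticated, normal HDFS calls go through the secured session.
            FileSystem fs = FileSystem.get(conf);
            System.out.println("Connected to " + fs.getUri());
        }
    }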

Sentry integration: PDI works with Apache Sentry for role-based access to particular Hadoop data sets, allowing granular tracking and enforcement of enterprise data authorization rules.

IMPROVED METADATA INJECTION

IT teams spend hours upon hours coding ingestion and processing jobs to onboard a wide range of big data sources. Metadata injection increases IT productivity across many data migration and onboarding projects by automating and scaling big data pipelines.

Metadata injection extended to more steps: IT teams can auto-generate a bigger array of data transformations at runtime, with metadata injection support added for more than 30 additional PDI steps. The newly injection-enabled steps include operations related to HBase, Hadoop, JSON, XML, Vertica, Greenplum, and other big data sources.
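
Metadata injection itself is configured inside PDI, but the underlying idea can be illustrated with a small, hypothetical Java sketch (not the PDI API): a single generic routine whose field definitions are supplied at runtime, so onboarding a new source means supplying new metadata rather than writing a new job:

    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // Conceptual illustration of metadata injection, not the PDI API.
    public class MetadataDrivenParser {

        // Field metadata: name and position within a delimited record.
        record FieldDef(String name, int position) {}

        public static Map<String, String> parseRecord(String line, List<FieldDef> metadata) {
            String[] tokens = line.split(",");
            Map<String, String> row = new LinkedHashMap<>();
            for (FieldDef field : metadata) {
                row.put(field.name(), tokens[field.position()].trim());
            }
            return row;
        }

        public static void main(String[] args) {
            // "Injected" metadata for one hypothetical source; another source
            // would supply a different list against the same template routine.
            List<FieldDef> customerFields = List.of(
                    new FieldDef("id", 0), new FieldDef("name", 1), new FieldDef("country", 2));
            System.out.println(parseRecord("42, Acme Corp, US", customerFields));
        }
    }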

MORE ENHANCEMENTS

Pentaho 7.0 includes several additional improvements to future-proof the investment and support a blended big data world.

Support for Avro and Parquet: Output files in Avro and Parquet, two common formats for storing data in Hadoop, in big data onboarding scenarios.
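
Outside Pentaho, writing one of these formats by hand with the standard Apache Avro Java library looks roughly like the sketch below; the schema, field names, and output file are placeholders, and in PDI the output steps handle this instead:

    import java.io.File;
    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericDatumWriter;
    import org.apache.avro.generic.GenericRecord;

    public class AvroWriteExample {
        public static void main(String[] args) throws Exception {
            // Placeholder schema; a real onboarding job would derive this
            // from the source metadata.
            Schema schema = new Schema.Parser().parse(
                    "{\"type\":\"record\",\"name\":\"Event\",\"fields\":["
                    + "{\"name\":\"id\",\"type\":\"long\"},"
                    + "{\"name\":\"payload\",\"type\":\"string\"}]}");

            GenericRecord event = new GenericData.Record(schema);
            event.put("id", 1L);
            event.put("payload", "hello");

            // Write a single-record Avro container file.
            try (DataFileWriter<GenericRecord> writer =
                         new DataFileWriter<>(new GenericDatumWriter<>(schema))) {
                writer.create(schema, new File("events.avro"));
                writer.append(event);
            }
        }
    }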

Support for Kafka: Send data to and receive data from Kafka, a popular message queue technology used in IoT and big data architectures.
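
A minimal sketch of the producing side with the standard Kafka Java client is shown below; the broker address, topic, and message are placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class KafkaSendExample {
        public static void main(String[] args) {
            // Placeholder broker address.
            Properties props = new Properties();
            props.put("bootstrap.servers", "kafka.example.com:9092");
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // Send a single message; in an IoT scenario this would be a
                // stream of device readings.
                producer.send(new ProducerRecord<>("sensor-readings", "device-17", "23.5"));
            }
        }
    }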

Simplified deployment, configuration, and administration: Configure, deploy, and manage the unified data integration and business analytics server more quickly and easily, supporting both Pentaho development and production environments.

Because other data integration vendors rely on partnerships for visualization, users still need to switch in and out of separate tools, and data cannot be visualized until the very end of data preparation. Pentaho is the sole vendor to provide natively combined data integration with analytics and visualization capabilities. By moving visualization upstream, users can now inspect data while it is in flight, at any stage of the data preparation process. Pentaho's 'better together' strategy for BI and DI has been consistent since the early days, but the value proposition has never been stronger than it is today with the release of Pentaho 7.0.

Pentaho 7.0 offers truly integrated data preparation and analytics, and the user experience is greatly improved, making this release a further installment in the quest for excellence.

About the Author

Dhrumit Shukla is Business Development Manager at TatvaSoft, a custom software development company. He writes about technology trends and his experience working with B2B and B2C clients.
