Accessing HDFS Data with gphdfs (Deprecated)
Greenplum Database uses the gphdfs protocol to read and write data files efficiently, taking advantage of the parallel architecture of the Hadoop Distributed File System (HDFS).
Note: The gphdfs external table protocol is deprecated and will be removed
in the next major release of Greenplum Database. Consider using the pxf
external table protocol, provided by the Greenplum Platform Extension
Framework (PXF), to access data stored in a Hadoop file system.
There are three steps to using the gphdfs protocol with HDFS:
- One-time gphdfs Protocol Installation (Deprecated)
- Grant Privileges for the gphdfs Protocol (Deprecated)
- Specify gphdfs Protocol in an External Table Definition (Deprecated)
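As a rough illustration of the second and third steps, the following sketch grants protocol privileges to a role and then defines a readable external table over a comma-delimited HDFS text file. The role name (hdfs_reader), table name and columns (ext_sales), host, port, and file path are all hypothetical placeholders, not values from this documentation. The one-time installation step is cluster-level configuration and is not shown.

```sql
-- Step 2: grant privileges on the gphdfs protocol.
-- "hdfs_reader" is a hypothetical role name.
GRANT SELECT ON PROTOCOL gphdfs TO hdfs_reader;  -- create readable external tables
GRANT INSERT ON PROTOCOL gphdfs TO hdfs_reader;  -- create writable external tables

-- Step 3: specify the gphdfs protocol in the LOCATION clause.
-- Host, port, path, and columns below are assumptions for illustration only.
CREATE EXTERNAL TABLE ext_sales (
    id     int,
    region text,
    amount numeric
)
LOCATION ('gphdfs://namenode.example.com:8020/data/sales/sales.txt')
FORMAT 'TEXT' (DELIMITER ',');

-- Once defined, the external table is queried like any other table.
SELECT region, sum(amount) AS total_amount
FROM ext_sales
GROUP BY region;
```

The host and port in the location URI should match the HDFS NameNode of the target cluster, and the column list should match the layout of the file.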
If Greenplum Database is installed on Amazon Web Services (AWS), see also Using Amazon EMR with Greenplum Database installed on AWS (Deprecated) for information about using Greenplum Database external tables with Amazon EMR.