Pivotal Hadoop (1 position), West Coast, need to move from one site to another



(They want Pivotal Hadoop specifically, not generic Hadoop, as they can already get generic Hadoop resources.)


 


High Level Skills – Can be staffed by same or multiple resources




Pivotal Greenplum Database


 Installation Related



  • Environment sizing

  • Hardware module selection

  • Site preparation


System Administration



  • Backup management

  • Indexing

  • Logging

  • System Profiling

  • Catalogue Cleansing


General Troubleshooting



  • Command Center usage for dashboards and metrics collection

  • Leveraging MoreVRP for monitoring and database control

  • gp_toolkit schema usage to query system state, logs, and metrics

  • Linux utilities for GPDB


Rudimentary DBA Tactics



  • Processes for backup and restore

  • Utilities for DBAs: gpstart, gpstop, gprecoverseg, gpexpand, gpcheckperf, etc.


Pivotal Extension Framework



  • Concepts and usage – agile and extensible in-database framework for analytics


GPText for Analysis of Data Sets Involving Text



  • Setup and Usage

  • Maintenance


Data Modeling and Design



  • Data Layout best practices

  • Differences between OLTP and OLAP

  • Design principles for successful implementations


User Defined Functions



  • Concepts

  • Languages

  • Usage to process or operate on data


Considerations for the Larger GPDB Ecosystem



  • JDBC and ODBC

  • Analytics involving SAS, R, and Alpine Data Mining

  • ETL and Informatica


 


Pivotal HD


 


Installation



  • Environment sizing and capacity plans

  • Hardware module selection

  • Site preparation


System Administration



  • Managing users

  • Networking

  • Services


General Troubleshooting



  • Logging

  • Command Center usage for dashboards and metrics collection


Security



  • Data at rest (Gazzang)

  • Data in motion (Kerberos)

  • LDAP


Pivotal Extension Framework



  • Concepts and usage – agile and extensible in-database framework for analytics


Data Modeling and Design



  • Data Layout best practices

  • Design principles for successful implementations


 


HAWQ


Installation



  • Environment sizing and capacity plans

  • Hardware module selection

  • Site preparation


System Administration



  • Indexing

  • Logging

  • System Profiling

  • Catalogue Cleansing


General Troubleshooting



  • Logging

  • Command Center usage for dashboards and metrics collection


Pivotal Extension Framework



  • Concepts and usage – agile and extensible in-database framework for analytics


GPText for Future Analysis of Data Sets Involving Text



  • Setup and Usage

  • Maintenance


Data Modeling and Design



  • Data Layout best practices

  • Design principles for successful implementations


User Defined Functions



  • Concepts

  • Languages

  • Current support and Roadmap


Considerations for the Larger HAWQ Ecosystem



  • JDBC and ODBC

  • Analytics involving SAS, R, and Alpine Data Mining

  • ETL - both Informatica and Hadoop Interaction


 


Data Movement


Ingestion with GPDB



  • Concepts

  • Tools (such as gpfdist, Informatica, etc.)


Ingestion with HAWQ



  • DataLoader

  • PXF for analytics


 


Of the above skills, Pivotal HD and HAWQ are the important ones. The client is aware that one person may not have all the skills, so present the best you can. Submit 2 candidates if you can (then we will see if they pan out).

