(They want Pivotal Hadoop; they do not want generic Hadoop, as they can already get those candidates.)
High-Level Skills – Can be staffed by the same resource or by multiple resources
Pivotal Greenplum Database
Installation Related
- Environment sizing
- Hardware module selection
- Site preparation
System Administration
- Backup management
- Indexing
- Logging
- System Profiling
- Catalogue Cleansing
General Troubleshooting
- Command Center usage for dashboards and metrics collection
- Leveraging MoreVRP for monitoring and database control
- gp_toolkit schema usage to query system state, logs, and metrics
- Linux utilities for GPDB
Rudimentary DBA Tactics
- Processes for backup and restore
- Utilities for DBAs: gpstart, gpstop, gprecoverseg, gpexpand, gpcheckperf, etc.
Pivotal Extension Framework
- Concepts and usage – agile and extensible in-database framework for analytics
GPText for Analysis of Data Sets Involving Text
- Setup and Usage
- Maintenance
Data Modeling and Design
- Data Layout best practices
- Differences between OLTP and OLAP
- Design principles for successful implementations
User Defined Functions
- Concepts
- Languages
- Usage to process or operate on data
Considerations for the Larger GPDB Ecosystem
- JDBC and ODBC
- Analytics involving SAS, R, and Alpine Data Mining
- ETL and Informatica
Pivotal HD
Installation
- Environment sizing and capacity plans
- Hardware module selection
- Site preparation
System Administration
- Managing users
- Networking
- Services
General Troubleshooting
- Logging
- Command Center usage for dashboards and metrics collection
Security
- Data at rest (Gazzang)
- Data in motion (Kerberos)
- LDAP
Pivotal Extension Framework
- Concepts and usage – agile and extensible in-database framework for analytics
Data Modeling and Design
- Data Layout best practices
- Design principles for successful implementations
HAWQ
Installation
- Environment sizing and capacity plans
- Hardware module selection
- Site preparation
System Administration
- Indexing
- Logging
- System Profiling
- Catalogue Cleansing
General Troubleshooting
- Logging
- Command Center usage for dashboards and metrics collection
Pivotal Extension Framework
- Concepts and usage – agile and extensible in-database framework for analytics
GPText for Future Analysis of Data Sets Involving Text
- Setup and Usage
- Maintenance
Data Modeling and Design
- Data Layout best practices
- Design principles for successful implementations
User Defined Functions
- Concepts
- Languages
- Current support and Roadmap
Considerations for the Larger HAWQ Ecosystem
- JDBC and ODBC
- Analytics involving SAS, R, and Alpine Data Mining
- ETL - both Informatica and Hadoop Interaction
Data Movement
Ingestion with GPDB
- Concepts
- Tools (like gpfdist, Informatica, etc.)
Ingestion with HAWQ
- DataLoader
- PXF for analytics
Of the above skills, Pivotal HD and HAWQ are the most important. The client is aware that one person may not have all the skills, so present the best candidates you can. Submit two if you can (then we will see if they pan out).
Key Skills: