Analyzes customer requirements and may provide technical knowledge on cloud cluster computing techniques and technologies of moderate complexity.
Assists with the research, evaluation, and deployment of cloud computing systems and large data analytics.
May assist customers with data integration and migration efforts, including mapping schemas and defining domain-specific ontologies.
Assists with the development of large data technologies, algorithms, and applications.
Maintains current knowledge of relevant technology as assigned.
DESIRED QUALIFICATIONS: BA/BS (or equivalent experience), 2+ years of experience
Role & Responsibilities:
Design and implement Big Data analytic solutions on Cloudera Data Platform. Create custom analytic and data mining algorithms to help extract knowledge and meaning from vast stores of data. Refine a data processing pipeline focused on unstructured and semi-structured data refinement. Support both quick-turn, rapid implementations and larger-scale, longer-duration analytic capability implementations.
Will be designing, developing, and implementing solutions in Cloudera (CDP and SDX).
Leverage CDP features to build hybrid-cloud architectures (CDP Public Cloud).
Will be working as a senior developer/SME for the Hadoop enterprise data platform, specifically in Cloudera ecosystem components such as HDFS, Sentry, HBase, Impala, Hue, Spark, Hive, Kafka, YARN, and ZooKeeper.
Will be scheduling jobs using Apache NiFi or Airflow.
Will be designing Big Data/Hadoop platforms (such as Hive, HBase, Kafka, YARN, and Impala) to identify and handle possible failure scenarios.
Will be developing components for big data platforms related to data ingestion, storage, transformations and analytics.
Will be executing and troubleshooting Spark and Hive jobs.
Will be developing shell/Scala/Python scripts to transform data in HDFS and to automate tasks.
Will be debugging, configuring, and tuning various components of the Hadoop ecosystem as part of development activities.
Will be importing and exporting data between HDFS and relational database systems using Sqoop for the new data pipelines.
Will be analyzing, recommending, and implementing improvements to support Environment Management initiatives.
Proven core skills to ingest, transform, and process data using Apache Spark™ and core pro
Hands-on experience with all the tools of Cloudera Data Platform (on-premises and in the cloud)
Experience with Cloudera Data Science Workbench and Cloudera Data Flow products.
Experience working with Cloudera Data Platform components such as
Cloudera Certified Professional (CCP) Data Engineer or Spark and Hadoop Developer
Experience with Hadoop and the HDFS Ecosystem
Strong experience with Apache Spark, Storm, and Kafka is a must.
Experience with Python, R, Pig, Hive, Kafka, Knox, Tomcat and Ambari
A minimum of 4 years working with HBase/Hive/MRv1/MRv2 is required
Experience in integrating heterogeneous applications is required
Experience working with the Systems Operation Department to resolve a variety of infrastructure issues
Experience with Core Java, Scala, Python, R
Experience with relational database systems (SQL) and hierarchical data management
Experience with ETL tools such as Sqoop and Pig
Data modeling and implementation
Any machine learning or AI experience is a big plus, particularly with Python/TensorFlow.
We are GDIT. The people supporting some of the most complex government, defense, and intelligence projects across the country. We deliver. Bringing the expertise needed to understand and advance critical missions. We transform. Shifting the ways clients invest in, integrate, and innovate technology solutions. We ensure today is safe and tomorrow is smarter. We are there. On the ground, beside our clients, in the lab, and everywhere in between. Offering the technology transformations, strategy, and mission services needed to get the job done.
GDIT is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or veteran status, or any other protected class.